Abstract: Many analyses of plurality-rule elections predict the complete coordination
of strategic voting, and hence support for only two candidates. Here I
suggest that stable multi-candidate support will arise in equilibrium. A group of 
voters must partially coordinate behind one of two challenging candidates in order 
to dislodge a disliked incumbent. In a departure from existing models, the popular 
support for each challenger is uncertain. This support must be inferred from the 
private observation of informative signals, such as the social communication of pref- 
erences throughout the electorate, or the imperfect observation of opinion polls. The 
uniquely stable voting equilibrium entails only limited strategic voting and hence 
incomplete coordination. This is due to the surprising presence of negative feed- 
back: an increase in the degree of strategic voting by others reduces the incentives 
for an individual to vote strategically. The incentive to vote strategically is lower in 
relatively marginal elections, after controlling for the distance from contention of a 
trailing preferred challenger. A calibration of the model applied to the UK General 
Election of 1997 is consistent with the impact of strategic voting and the reported 
accuracy of voters’ understanding of the electoral situation. It suggests that nearly
50 seats may have been lost by the Conservative party due to strategic voting.


1. PLURALITY-RULE ELECTIONS AND STRATEGIC VOTING 


1.1. Strategic Voting and Duverger’s Law. Plurality-rule elections, where the winner is 
the candidate who receives the largest number of votes, are vulnerable to strategic voting: a 
voter might well switch away from her preferred candidate and toward a perceived leader, in 
the hope of exerting a greater influence over the outcome of the election. This phenomenon 
has long been the focus of both economic and political scientific study. In an influential essay, 
Riker (1982b) cited Droop’s (1871) eloquent exposition of the strategic-voting problem: 


1I offer my grateful thanks to the many friends, colleagues, seminar participants, referees, and editors who
have offered insightful comments during the long gestation of this paper, its antecedents, and companion 
work. Special thanks go to Steve Fisher, whose empirical analysis of strategic voting raised many of the 
questions that I seek to answer in this paper, and to Chris Wallace. This particular paper brings together 
much of the analysis previously circulated in the papers A New Theory of Strategic Voting, Strategic Voting 
Under the Qualified Majority Rule, and Idiosyncrasy, Information, and the Impact of Strategic Voting. 


“As success depends upon obtaining a majority of the aggregate votes of all 
the electors, an election is usually reduced to a contest between the two most 
popular candidates ... even if other candidates go to the poll, the electors 
usually find out that their votes will be thrown away, unless given in favor of 
one or other of the parties between whom the election really lies.” 


Droop’s observation incorporated two interesting ideas. First, voters may have instrumental 
motivations for their decisions; they may choose to vote in a way that best influences who 
wins the election.² Second, it is an early version of what is known to political scientists as
Duverger’s law; this suggests that plurality-rule elections will tend to lead to a two-party 
system. Duverger (1954) envisaged an ongoing process in which political parties would react 
to the likely outcomes of plurality elections. His “psychological effect” of strategic voting was 
a key component of this process; however, he predicted only a tendency toward bipartism. 


Modern theories of strategic voting offer more-dramatic predictions. Palfrey (1989) and 
Myerson and Weber (1993) described formal models in which third-candidate support dis- 
appears: the equilibria of these models are strictly Duvergerian, in the sense that all voters 
fully coordinate on only two candidates.⁴ Cox (1994) highlighted the existence of a second
category of non-Duvergerian equilibria. They involve vote switching away from a leading 
candidate and toward a trailing candidate resulting in a tie for second place. Such a tie 
attenuates incentives to vote strategically, and hence can yield an equilibrium outcome. 


It is certainly the case that, under the plurality rule, two political parties often dominate. 
At the level of an electoral district, however, neither the strictly Duvergerian predictions 
of Palfrey (1989) and Myerson and Weber (1993), nor the non-Duvergerian equilibrium 
emphasized by Cox (1994) serve as stylized descriptions of observed behavior. In Section 6.3 
I argue that this is the case in English parliamentary constituencies. Here, however, the 1970 
New York senatorial election provides a classic illustration. This election was highlighted by 
Riker (1982a) and others as a failure of the coordination of strategic voting.⁵ In a “three
horse race” two liberal candidates, Richard L. Ottinger and Charles E. Goodell, competed 


2A succinct modern definition (Fisher 2001a) is that “a tactical voter is someone who votes for a party they 
believe is more likely to win than their preferred party, to best influence who wins in the constituency.” 
3Duverger (1954) introduced his law by noting that the “simple-majority single-ballot system favors the 
two-party system.” The term “simple majority” refers to the plurality rule, and the term “single ballot” 
indicates an environment with simple single-member districts. Riker (1982b), cited by Feddersen (1992), 
restated the law as asserting that “plurality election rules bring about and maintain two-party competition.” 
4Palfrey’s (1989) summary was that “with instrumentally rational voters and fulfilled expectations, multi- 
candidate contests under the plurality rule should result in only two candidates getting any votes.” While 
here I restrict attention to the plurality rule, Myerson and Weber (1993) succeeded in characterizing equilibria 
for a wider variety of electoral systems, and Cox (1994) analyzed elections with multi-member districts. 
5This example was also used, and to great effect, in the undergraduate text of Morton (2001), and Cox
(1994) used it to motivate his discussion of non-Duvergerian voting equilibria. 


Candidate        |     Votes | Share
James Buckley    | 2,288,190 |   39%
Charles Goodell  | 1,434,472 |   24%
Richard Ottinger | 2,171,232 |   37%
Total            | 5,893,894 |  100%


TABLE 1. The 1970 New York Senatorial Election 


against the conservative James R. Buckley.⁶ I present the outcome of this election in Table
1. A widely held belief was that the liberal vote was split between Goodell and Ottinger, 
allowing the win for Buckley. The outcome was certainly not Duvergerian. Neither was it 
non-Duvergerian in the sense of Cox (1994), since such equilibria require the challengers to 
tie for second place. 


1.2. Coordination and Common Knowledge. The 1970 New York senatorial election, 
and parliamentary elections in the UK and other countries, exhibit only partial coordination 
of strategic voting.⁷ In this paper, I suggest that such outcomes are consistent with all but
one of the assumptions made by Palfrey (1989), Myerson and Weber (1993), and Cox (1994). 
My point of departure is this: I suppose that there is no common knowledge of the electoral 
situation, and that voters must learn of this situation via the private observation of infor- 
mative signals. Existing theories of voting under the plurality rule rest on the assumption 
that an individual votes for a particular candidate with some commonly known probability.⁸
Equivalently, the election outcome is the realization of a multinomial random variable with 
commonly known parameters. This assumption is fundamental to existing results; I discuss 
this claim here, prior to describing the new contributions of this paper. 


To verify the claim, I begin by highlighting two important insights of existing theories. First,
an instrumental voter cares only about situations in which two or more candidates tie for
the lead; these are the only circumstances in which her vote really matters.⁹ Second, any
theory of strategic voting must involve an uncertain election outcome; absent any uncertainty,
almost any pattern of voting behavior yields an equilibrium.¹⁰


6More specifically, Goodell was an incumbent Republican who had taken a liberal stance on the Vietnam
War, and hence received the nomination of the Liberal Party. The New York Conservative Party, however,
rather than nominating Goodell as a “fusion” candidate instead supported the conservative Buckley.
7Analysis of strategic voting in the UK was provided by Johnston and Pattie (1991), Lanoue and Bowler
(1992), and Niemi, Whitten, and Franklin (1992), inter alios. New research, based on British Election Study
data and analyzing intuitive predictions, the bimodality hypothesis of Cox (1994) and the predictions of this
paper, was reported by Myatt and Fisher (2003); see also Fisher (2000, 2001b). The bipartite prediction also
fails in India. Riker (1976) offered an analysis of this case, and hypothesized that reduced strategic voting
was due to the presence of a clear Condorcet winner, against whom strategic voting is futile.

8In addition to the aforementioned papers, this is also true of more recent contributions, including papers by
Myerson (1999, 2000, 2002) and Piketty (2000).


When the situation is common knowledge, any uncertainties in individual decisions are 
averaged out in a large electorate. Hence, should a tie occur, it will almost always be 
between the two leading candidates; furthermore, the identities of these leaders are common 
knowledge. For almost all voters, it is optimal to support one of these leaders: a Duvergerian
outcome.¹¹ Moreover, the law of large numbers removes any real uncertainty.¹²


With this observation in hand, it is interesting to note that Droop’s (1871) logic can apply 
only when the identities of the parties “between whom the election really lies” are known to 
voters. Similarly, Cox (1997, p. 78) fully recognized that a “condition necessary to generate 
pure local bipartism is that the identity of trailing and front-running candidates is common
knowledge.” The step taken in this paper is to remove this common knowledge, and assess
the impact of voters’ information and beliefs on the strategic-voting phenomenon.¹³


1.3. Private Information and Negative Feedback. I turn now to the contributions of 
this paper. To focus on the central issues discussed above, I restrict to a class of three- 
candidate plurality-rule elections. One of the candidates, perhaps a disliked incumbent 
office-holder, enjoys some fixed, unwavering support. The remaining group of voters wish 
to dislodge this disliked incumbent.¹⁴ To do so, a proportion $\gamma > \frac{1}{2}$ of the anti-incumbent
voters must vote in favor of one of two challenging candidates. This generates a qualified
majority voting game: a qualified majority (or super-majority) $\gamma$ of the dissatisfied voters
must successfully coordinate if the incumbent is to be defeated. This scenario captures many 


9An “instrumental voter” cares only about her vote insofar as it influences the winner of the election. The
importance of pivotal events was, of course, fully recognized in the earlier decision-theoretic analyses of
McKelvey and Ordeshook (1972) and Hoffman (1982), as well as the game-theoretic treatments cited above.
10Osborne (1995, p. 293) offered a concise summary of this point. When the outcome is known, and a voter
is not pivotal (there is no tie or near-tie for the lead), instrumental concerns give no guide to her.

11Cox (1994) cleverly constructed non-Duvergerian equilibria in the following way. Begin by ordering three
candidates according to their popularity. From the supporters of the second candidate, a precise fraction
switch away to the third candidate. This ensures that there is (almost) a tie for second place. More
importantly (for the construction of an equilibrium) either of the second or third candidates may tie with
the first candidate; this ambiguity prevents all support bleeding away to the leading pair.

12Many authors have observed that the specification of uncertainty is central to any theory of voting. Palfrey
and Rosenthal (1985, p. 62), for instance, examined “the role played by uncertainty or the lack of information
in affecting political processes.” They considered a voter-participation game with incomplete information
over voting costs. However, the distribution of voting costs was assumed to be known, and individuals were
drawn independently from it. For a large electorate, therefore, aggregate uncertainty was absent.

13The information and beliefs of voters concern empirical researchers: Heath and Evans (1994) criticized the
Niemi et al. (1992) measure of strategic voting by observing that it does not allow for the possibility that
voters are mistaken in their perceptions of the likely chances of the various parties winning a constituency.

14Throughout the paper, I will refer to the disliked candidate as an “incumbent.” This terminology is adopted
in order to ease the exposition; the current occupations of the candidates play no part in the analysis.

elements of modern plurality elections. In the 1997 UK General Election the incumbent (and
unpopular) Conservative party polled between one-third and one-half of the votes cast for
the three major parties in 270 out of 529 English constituencies. In more than half of
England, therefore, anti-Conservative voters needed to successfully coordinate behind either
the Labour candidate or the Liberal Democrat candidate in order to ensure a Tory defeat.¹⁵
This scenario is also a stylized description of the 1970 New York senatorial election (Table 1).


Amongst the anti-incumbent voters, the fraction who prefer the first challenger to the sec- 
ond is unknown. Voters must infer this fraction from the private observation of informative 
signals. Such signals might reflect either the social communication of preferences through- 
out the electorate, or the imperfect observation of opinion polls. The private observation 
ensures that the electoral situation is never common knowledge. This leads to the somewhat- 
surprising phenomenon of negative feedback: when others in the electorate are expected to 
vote strategically, then an individual voter faces a reduced incentive to do so. 


The explanation for this negative feedback begins with an examination of the voting problem 
faced by an individual voter. Based on any private information available to her, such an 
individual will consider the relative likelihood of the different “pivotal outcomes” under 
which one of the challenging candidates ties with the disliked incumbent. Furthermore, the 
likelihood ratio of such events will be determined by her beliefs about the true underlying 
support for each of the challengers. If the likelihood ratio deviates away from unity, then 
the individual in question may face an incentive to vote strategically. 


With such likelihood ratios in mind, consider a benchmark scenario in which an individual 
voter expects all others to vote truthfully. A relatively large bias in underlying support 
toward the first challenger may be required to achieve the qualified majority $\gamma$, and similarly
a relatively large bias toward the second challenger may be needed for him to do the same.
The voter compares the probabilities of these pivotal events, which are relatively far apart
and hence will have different probabilities of occurrence. Her incentive to vote strategically may
then be large. 


Suppose instead that an individual voter anticipates that others are likely to vote strategically 
by responding strongly to their private signals. A relatively small bias in the true underlying 
support toward the first challenger is all that is required for him to hit the qualified majority
of $\gamma$: a small bias will lead to signals indicating the first challenger’s status as a leading
contender, and hence will prompt net strategic switching away from the second challenger.
This enhances the first challenger’s position, enabling him to reach the threshold $\gamma$. Identical


15Throughout the parliamentary constituencies of England, the three major political parties (Conservative,
Labour, and Liberal Democrat) are the only effective competitors. The Conservative (equivalently, Tory)
party is seen by many as right-wing; the other parties are seen as left-wing. Some may argue that the
positions of the parties on the Downs-Black-Hotelling spectrum have changed since Labour’s rise to power.
For that reason, I restrict attention to the 1997 election in Section 6.3.

logic shows that a relatively small bias in true support toward the second challenger is all
that is required for him to do the same. The individual voter contemplates two situations
involving relatively smaller biases. Such events are relatively close and hence will have
similar probabilities of occurrence: the incentive to vote strategically is small.
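
The contrast between the two scenarios can be illustrated numerically. The following is a minimal sketch under simplifying assumptions that are illustrative and not taken from the formal model: the voter's beliefs about the bias in underlying support are treated as normal, centred on her private signal with standard deviation `kappa`; the two pivotal events are placed symmetrically at biases of `+critical_bias` and `-critical_bias`; and the likelihood ratio of the pivotal events is approximated by the ratio of belief densities at those two points. The function name and all parameter values are hypothetical.

```python
import math
from statistics import NormalDist


def log_likelihood_ratio(signal: float, critical_bias: float, kappa: float) -> float:
    """Log ratio of belief densities at +critical_bias versus -critical_bias.

    Illustrative assumption: beliefs about the bias in support are
    N(signal, kappa^2), and the two pivotal events sit symmetrically
    at +/- critical_bias.
    """
    beliefs = NormalDist(mu=signal, sigma=kappa)
    return math.log(beliefs.pdf(critical_bias) / beliefs.pdf(-critical_bias))


signal, kappa = 0.2, 1.0  # hypothetical private signal and belief dispersion

# Others vote truthfully: a large bias is needed to reach the qualified majority,
# so the two pivotal events are far apart.
truthful = log_likelihood_ratio(signal, critical_bias=1.0, kappa=kappa)

# Others vote strategically: a small bias suffices, so the pivotal events are close.
strategic = log_likelihood_ratio(signal, critical_bias=0.25, kappa=kappa)

print(round(truthful, 3), round(strategic, 3))  # 0.4 0.1
```

Under these assumptions the log ratio equals 2 × signal × critical_bias / kappa², so it shrinks toward zero (a likelihood ratio of unity) as the pivotal events move closer together.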


This argument contrasts with a tale of positive feedback. According to this, more familiar, 
logic, strategic voting reduces the support of the less popular challenger. This loss of support 
(and consequent gain for the leading challenger) enhances the incentive to vote strategically, 
and thus further erodes the support of the trailing challenger. This is the “bandwagon effect” 
of Simon (1954), and leads to the elimination of any third-party support. This bandwagon 
logic is somewhat flawed: when voters do not have a common understanding of the electoral 
situation, it will not be clear who the leading challenger is. To put it somewhat crudely, a 
voter becomes fearful that the bandwagon is rolling in a direction opposite to that indicated 
by her own private signal, and is more cautious about switching her vote. Anticipation of 
more switching by others (a speedier bandwagon) enhances this caution. 


1.4. Implications. The negative-feedback effect is somewhat subtle; nevertheless, its impli- 
cations are immediate. First, it implies that game-theoretic considerations (allowing voters 
to anticipate strategic switching by others) actually reduce the impact of strategic voting. 
Second, negative feedback leads away from fully coordinated strategic voting behavior and 
toward a uniquely stable equilibrium in which there is only partial coordination: both chal- 
lenging candidates enjoy positive support. Third, while the net effect of strategic voting is to 
help the leading challenger at the ballot box, there is also movement away from this candi- 
date; some voters, receiving erroneous signals of the challengers’ support levels, switch in the 
wrong direction. Fourth, the existence of a uniquely stable equilibrium with multi-candidate 
support permits comparative static exercises: I am able to ascertain the implications of 
changes in the electoral situation; of the idiosyncrasy of voters’ preferences; and of the qual- 
ity of their information sources. Furthermore, while I attempt no real empirical analysis, a 
calibration of the model applied to the UK General Election of 1997 resonates with observed 
behavior, and offers insights into the wider consequences of strategic voting. Finally, in pre- 
dicting only a tendency toward bipartism, this paper offers results in the spirit of Duverger’s 
(1954) original legislation. The paper represents, therefore, not so much brand new theory, 
but rather a careful attempt to build a more rigorous foundation for his psychological effect. 


1.5. Related Literature. Strategic voting has long been of interest to social scientists. 
The connections between this paper and the established literature are described throughout 
later sections; I offer only a few brief comments here.¹⁶ Formal studies began with decision-theoretic
analyses, where voters do not take into account strategic vote-switching by others.
Following Farquharson (1969), McKelvey and Ordeshook (1972) built upon work by Riker 


16For a thorough review, I point to Cox’s (1997) comprehensive survey of the literature.

and Ordeshook (1968); this analysis was extended by Hoffman (1982), Cox (1984) and Niemi
(1984). Later studies were game-theoretic, and examined the Nash and Bayesian-Nash equi- 
libria of voting games. The aforementioned papers by Palfrey (1989), Myerson and Weber 
(1993) and Cox (1987, 1994) fall within this category. This paper builds upon all of this 
earlier work; as noted above my point of departure is the removal of the assumption that 
voters have common knowledge of the electoral situation: voters must use the information 
sources at their disposal to detect which strategic-voting game they are playing. 


The absence of common knowledge means that the model is a global game in the sense 
of Carlsson and van Damme (1993); it is (Morris and Shin 2003) a game “of incomplete 
information whose type space is determined by the players each observing a noisy signal 
of the underlying state.”¹⁷ Here, the “underlying state” is the true relative popularity of
the two challenging candidates. This determines the character of the game being played; in 
essence, the identity of the challenger best placed to beat the incumbent. The “noisy signal” 
corresponds to information such as social communication. In recent years, economists have 
made great progress by applying the techniques of global games to a wider range of problems. 
Applications include currency crises (Morris and Shin 1998), the pricing of debt (Morris 
and Shin 2004), and bank runs (Goldstein and Pauzner 2003); there are, of course, many 
others. There have been fewer applications to voting scenarios. Feddersen and Pesendorfer 
(1996, 1997, 1998) studied two-candidate models in which voters receive private information 
of an underlying state, and where an individual’s own payoffs depend upon the underlying 
state; for instance, members of a jury must decide whether to vote for the conviction of a 
defendant. Their main focus was upon the information-aggregation function of an election. 


1.6. A Guide to the Paper. I study a finite electorate in Section 2, and specify voters’ 
payoffs and information sources. In Section 3, I consider optimal voting behavior in large 
electorates, and use my results to define an appropriate limit-game with a unit mass of voters. 
I find a uniquely stable equilibrium to this game in Section 4, and analyze its comparative 
statics in Section 5. In Section 6, I consider further micro-foundations for the information 
sources of voters, before applying a calibration of the model to the 1997 UK General Election. 


17Both Morris and Shin (2003) and Myatt, Shin, and Wallace (2002) offered discussions of this literature.


2. STRATEGIC VOTING AS A QUALIFIED MAJORITY VOTING GAME 


2.1. The Strategic Voting Problem. A plurality-rule election is deceptively simple: each 
member of an electorate casts a single vote; the candidate enjoying the highest vote-total 
wins the election.¹⁸ Despite this simplicity, there is potential for voters strategically to switch
in many directions; a full analysis of strategic voting is necessarily complex. 


In many situations, however, a stylized representation of the strategic-voting problem is more 
straightforward. In English parliamentary constituencies, for example, three major political 
parties compete. In 1997, the incumbent Conservative party faced challenges from the Liberal 
Democrat and Labour parties.¹⁹ The Conservative administration was unpopular, and hence
many voters wished to see the defeat of Tory candidates. Such voters faced the following 
problem: should they vote for their favorite candidate, or switch to a second choice? A switch 
would make sense if a vote for the second choice was more likely to stop a Conservative win. 
This characterization of the 1997 election suggests an analysis of two-way switching between 
supporters of the Labour and Liberal Democrat parties. 


2.2. Qualified-Majority Voting. To model this stylized game of “beat the incumbent” I 
begin by fixing the votes cast for an otherwise-disliked incumbent candidate $j = 0$ at $x_0$; the
incumbent will win unless another candidate obtains more votes than this. A group of $n+1$
anti-incumbent voters is indexed by $i \in \{0, 1, \ldots, n\}$. Each individual $i$ casts a single vote
for one of two challenging candidates $j \in \{1, 2\}$; the votes cast for challenger $j$ total $x_j$, so
that $x_1 + x_2 = n + 1$. Next, I assume that the votes cast for the incumbent satisfy

\[
x_0 = \lceil \gamma n \rceil, \quad \text{where } \tfrac{1}{2} < \gamma < 1,
\]

and where $\lceil \gamma n \rceil$ indicates the least integer that is weakly greater than $\gamma n$. The restriction
$\gamma > \frac{1}{2}$ ensures that a qualified majority (or super-majority) of anti-incumbent voters must
support one of the challengers in order to ensure the incumbent’s defeat; this generates
an environment in which strategic voting might take place, and further implies that the
incumbent candidate can never trail in third place.²⁰ The restriction $\gamma < 1$ ensures that
sufficient coordination by anti-incumbent voters (for instance, when $x_j = n + 1$ for some
$j \in \{1, 2\}$) will prevent an incumbent win; $\gamma$ indexes the exact degree of coordination


18There may, of course, be a tie for the lead, and it is the (perhaps remote) possibility of such ties that will 
be of interest to any instrumentally motivated voter. In UK parliamentary elections, the returning officer 
enjoys a casting vote in the event of a tie, and breaks this tie with the aid of a fair coin. 

19Prior to 1997, the Tories held overall power. They did not, however, hold every constituency. As in
Section 1, the label “incumbent” is used in a loose fashion; the candidate so labelled is simply the one with 
fixed support, and against whom other voters wish to coordinate. 

20A qualified majority (equivalently, Black’s (1948) special majority) may be used to eliminate Condorcet 
cycles; see Simpson (1969), Kramer (1973, 1977), Greenberg (1979), and Caplin and Nalebuff (1988). 


required. Turning to the election outcome, and based upon the vote totals, the winner is

\[
j = \begin{cases}
0 & \text{if } \max\{x_1, x_2\} \le x_0, \\
1 & \text{if } x_1 > x_0, \text{ and} \\
2 & \text{if } x_2 > x_0.
\end{cases}
\]

Notice that the incumbent wins whenever $x_0 = x_j$ for $j \in \{1, 2\}$. This is without loss of
generality; nothing would change if the winner were randomly chosen in the event of a tie.
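
The qualified-majority threshold and the winner rule can be sketched directly in code. This is a minimal illustration only: the function names `incumbent_votes` and `winner`, and the example electorate of eleven anti-incumbent voters, are hypothetical and chosen for readability.

```python
import math


def incumbent_votes(gamma: float, n: int) -> int:
    """Incumbent's fixed vote total: the least integer weakly greater than gamma * n."""
    if not 0.5 < gamma < 1:
        raise ValueError("gamma must lie strictly between 1/2 and 1")
    return math.ceil(gamma * n)


def winner(x1: int, x2: int, x0: int) -> int:
    """Return 0 (incumbent), 1, or 2. The incumbent wins any tie for the lead."""
    if x1 > x0:
        return 1
    if x2 > x0:
        return 2
    return 0  # neither challenger strictly exceeds the incumbent's total


# Hypothetical example: n + 1 = 11 anti-incumbent voters and gamma = 0.65.
n = 10
x0 = incumbent_votes(0.65, n)   # ceil(6.5) = 7

print(x0)                 # 7
print(winner(8, 3, x0))   # 1: enough coordination behind challenger 1
print(winner(7, 4, x0))   # 0: a tie with the incumbent is resolved in his favor
print(winner(6, 5, x0))   # 0: a split anti-incumbent vote lets the incumbent win
```

Because the threshold exceeds one half of the anti-incumbent electorate, at most one challenger can exceed the incumbent's total, so the order of the two checks in `winner` does not matter.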


An inspection of Table 1 reveals that the 1970 New York senatorial election fits within 
this story: Buckley was the disliked incumbent, with $x_0 = 2{,}288{,}190$ votes. Votes cast for
Goodell and Ottinger, corresponding to anti-incumbent liberals, totalled $n + 1 = 3{,}605{,}704$.
The critical qualified-majority needed for a conservative defeat was then $\gamma = x_0/n \approx 0.635$.
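
As a quick arithmetic check of these figures, the following sketch recomputes the anti-incumbent total and the implied threshold from the vote counts in Table 1. The variable names are illustrative; the comparison with the best-placed challenger's share is an added observation rather than a calculation reported in the text.

```python
# Vote totals from Table 1 (1970 New York senatorial election).
votes = {"Buckley": 2_288_190, "Goodell": 1_434_472, "Ottinger": 2_171_232}

x0 = votes["Buckley"]                             # fixed "incumbent" support
n_plus_1 = votes["Goodell"] + votes["Ottinger"]   # anti-incumbent electorate
n = n_plus_1 - 1

gamma = x0 / n   # critical qualified majority among anti-incumbent voters

# Share of the anti-incumbent vote won by the better-placed challenger.
best_challenger_share = max(votes["Goodell"], votes["Ottinger"]) / n_plus_1

print(n_plus_1)                          # 3605704
print(round(gamma, 3))                   # 0.635
print(round(best_challenger_share, 3))   # 0.602
```

The better-placed challenger's share of the anti-incumbent vote falls short of the threshold, which is consistent with the Buckley win shown in Table 1.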


The assumptions that the incumbent candidate enjoys a fixed number of votes and that the
remaining $n+1$ voters are “anti-incumbent” mean that incumbent supporters play no part
in the analysis that follows. For this reason, all references to voters, the constituency, and 
the electorate, will indicate the n + 1 anti-incumbents. 


2.3. Preferences. Payoffs are contingent only upon the identity of the winning candidate:
voter $i$ receives a payoff $u_{ij}$ when $j$ wins the election. All $n+1$ voters strictly prefer both challengers
to the incumbent. This permits a normalization of $u_{i0} = 0$, so that $\min\{u_{i1}, u_{i2}\} > 0$.
Later analysis will confirm that the ratio of the payoffs $u_{i1}$ and $u_{i2}$ is sufficient to describe a
voter’s preferences; taking logarithms, I define $\tilde{u}_i = \log[u_{i1}/u_{i2}]$, so that the sign of $\tilde{u}_i$ determines
the identity of voter $i$’s preferred challenger, and the size $|\tilde{u}_i|$ determines the intensity
of this first preference. The log relative-preference $\tilde{u}_i$ is broken down into two components.


Assumption 1. $\tilde{u}_i$ is decomposed into common and idiosyncratic components,

\[
\tilde{u}_i = \log\left[\frac{u_{i1}}{u_{i2}}\right] = \eta + \varepsilon_i,
\]

where $\eta$ is common to everyone, and $\varepsilon_i \sim N(0, \xi^2)$ independently throughout the electorate.


$\eta = E[\tilde{u}_i]$ is the expected log relative-preference across the population, conditional on the
electoral situation. One interpretation is that $\eta$ represents common factors affecting all
voters, whereas $\varepsilon_i$ represents the idiosyncratic preference of voter $i$. A second is that $\eta$ is the
log relative-preference, and hence essential identity, of the median anti-incumbent voter.


Following Assumption 1, the electoral situation may be described by the parameters $\gamma$, $\eta$
and $\xi$. Alternatively, $\eta$ may be inferred from the fraction $\pi$ of the electorate who prefer the
first challenger: $\pi = \Pr[\tilde{u}_i > 0] = \Phi(\eta/\xi)$, where $\Phi(\cdot)$ is the cumulative distribution function
of the standard normal distribution. This may be inverted to obtain $\eta = \xi\Phi^{-1}(\pi)$. Finally,
the specification for the distribution of $\varepsilon_i$ permits a simple underlying foundation for the
signal technology developed below. It also offers the following convenience: $\xi^2$ provides a
measure of the degree of preference idiosyncrasy throughout the electorate.
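
The mapping between the common component of preferences and the fraction of the electorate preferring the first challenger, together with its inverse, can be checked numerically. This is a minimal sketch using the standard-library `statistics.NormalDist`; the function names and the parameter values are illustrative assumptions.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def share_preferring_first(eta: float, xi: float) -> float:
    """Fraction preferring the first challenger: Phi(eta / xi)."""
    return STD_NORMAL.cdf(eta / xi)


def common_component(pi: float, xi: float) -> float:
    """Inverse mapping: eta = xi * Phi^{-1}(pi)."""
    return xi * STD_NORMAL.inv_cdf(pi)


xi = 2.0                          # hypothetical degree of preference idiosyncrasy
eta = common_component(0.6, xi)   # common component implied by a 60% share

print(share_preferring_first(0.0, xi))             # 0.5: an evenly split electorate
print(round(share_preferring_first(eta, xi), 6))   # 0.6: the round trip recovers the share
```

For a fixed share above one half, a larger `xi` implies a proportionally larger `eta`: more idiosyncratic preferences require a stronger common component to generate the same split.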


2.4. Information and Social Communication. Voter $i$ is assumed to know $\tilde{u}_i$, but not
its decomposition into $\eta$ and $\varepsilon_i$.²¹ This means that the exact identity of the median anti-incumbent
voter $\eta$ (and hence the proportion $\pi = \Phi(\eta/\xi)$ who prefer the first challenger) is
unknown; it must be inferred from any information sources available to a voter.²²


Voters begin with a common and diffuse prior over $\eta$. Equivalently, prior to the receipt of
any informative signals, they are ignorant of the electoral situation.²³ Tighter beliefs are
generated following the acquisition of information pertaining to this situation. This information
is encapsulated in an informative signal $\delta_i \mid \eta \sim N(\eta, \kappa^2)$ of the common component
$\eta$, with associated precision $1/\kappa^2$. Before stating this specification as Assumption 2, I first
consider two examples of the information sources that might be available to a voter.


A first source of information is this: a voter may employ $\tilde{u}_i$ as an estimate of $\eta$.²⁴ This
yields a signal $\delta_i = \tilde{u}_i$ with variance $\kappa^2 = \xi^2$. Moving beyond simple introspection, a more
precise signal may be obtained via a second source: the social communication of preferences
throughout the electorate.²⁵ To explore this idea, suppose that voter $i$ observes the preferences
of $m-1$ randomly chosen members of the electorate, indexed by $k$.²⁶ Augmenting these
observations with her own log relative-preference $\tilde{u}_i$, this sample (of total size $m$) allows her
to estimate $\eta$.²⁷ Given the normality assumption, the sample mean is a sufficient statistic


21 This contrasts with the models of Feddersen and Pesendorfer (1996, 1997, 1998), in which “voters are
uncertain about the realization of a state variable that affects the utility of all voters.” The distinction is
that, in their work, the voter’s own payoff is unknown; this leads to their “swing voter’s curse.”

22 Qualified-majority voting is thus a global game (Carlsson and van Damme 1993); recall (Section 1) that
Morris and Shin’s (2003) definition is a game “of incomplete information whose type space is determined by
the players each observing a noisy signal of the underlying state.” The state variable in this case is $\eta$, and
the noisy signal of the underlying state is given in Assumption 2. See also Myatt, Shin, and Wallace (2002).

23 An alternative would be to allow voters to begin with a prior belief $\eta \sim N(\mu, \sigma^2)$ and allow $\sigma^2 \to \infty$. All
the results continue to hold with a non-diffuse prior so long as $\sigma^2$ is sufficiently large. Adopting a diffuse prior
belief does not eliminate the possibility that voters have prior information stemming from earlier elections, or
from current media or opinion-poll analysis. So long as such information is transmitted with some (perhaps
small) noise then it may be viewed within the context of Assumption 2.

24 In a recent contribution Goeree and Grosser (2003, p. 2) noted that “people tend to use their own tastes
and beliefs as information in guessing what others like and believe.”

25 Pattie and Johnston (1999) demonstrated that the contextual effects of conversations with family, acquaintances
and others were associated with vote-switching behavior in the UK General Election of 1992.

26 This differs from the bounded-rationality sampling equilibrium described by Osborne and Rubinstein
(2003), in which voters observe a sample of the actions taken by others.

27 I am implicitly assuming that she can elicit the true preferences of those within her sample rather than
their stated preferences; sampled individuals might choose to misrepresent their preferences in order to
strategically manipulate her beliefs. Of course, the voter would anticipate such manipulation and adjust
accordingly. I side-step this issue by supposing that information acquisition occurs over a period of time
prior to an election, when individuals in the community have little opportunity to hide their true colors.

for $\eta$, generating an aggregate signal
\[ \delta_i = \frac{1}{m}\Big[\tilde{u}_i + \sum_{k=1}^{m-1} \tilde{u}_k\Big], \qquad \delta_i \mid \eta \sim N\!\Big(\eta, \frac{\xi^2}{m}\Big). \]


Hence an informative signal, with precision $m/\xi^2$, arises from a detailed “private opinion poll”
of size $m$. This size would correspond to the number of people with whom an individual
interacts, so long as such people are drawn at random from the population.

Notice that the inclusion of $\tilde{u}_i$ within the sample mean of a voter’s social communication
ensures that her posterior beliefs depend only on the sufficient statistic $\delta_i$. I state the
formal specification of a voter’s signal in terms of such a sufficient statistic.


Assumption 2. Voter $i$ privately observes a signal $\delta_i \mid \eta \sim N(\eta, \kappa^2)$. Conditional on $\eta$, the
private signal $\delta_i$ and idiosyncratic component $\varepsilon_i$ are joint normal. The signal encapsulates
all payoff-relevant information for voter $i$; formally it is a sufficient statistic for $\eta$.


The restriction of the signal to a single dimension is without loss of generality. If a voter were
to observe a multi-dimensional normally distributed signal then $\delta_i$ would be the appropriate
weighted average of the signal components. Furthermore, the fact that a voter’s own log
relative preference is informative ensures that, conditional on $\eta$, $\tilde{u}_i$ and $\delta_i$ are correlated.
In the introspection case, $\delta_i = \tilde{u}_i$, and hence $\operatorname{cov}[\delta_i, \tilde{u}_i \mid \eta] = \xi^2 = \kappa^2$. Similarly, for the
social-communication case, it is straightforward to observe that $\operatorname{cov}[\delta_i, \tilde{u}_i \mid \eta] = \xi^2/m = \kappa^2$.
In fact, $\delta_i$’s status as a sufficient statistic for $\eta$ ensures that the covariance is always equal
to the variance of the signal, as the following lemma confirms.²⁸


Lemma 1. Conditional on $\eta$, a voter’s signal $\delta_i$ and log relative-preference $\tilde{u}_i$ are positively
correlated: $\operatorname{cov}[\delta_i, \tilde{u}_i \mid \eta] = \kappa^2$, yielding a corresponding correlation-coefficient of $\rho = \kappa/\xi$.
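
As an informal check of Lemma 1 in the social-communication case (not part of the formal analysis), the following Monte Carlo sketch draws a voter's own log relative-preference together with $m-1$ sampled preferences, forms the sample-mean signal, and estimates its variance and its covariance with the voter's own preference. The function name and the parameter values ($\eta = 0.3$, $\xi = 1$, $m = 5$) are illustrative assumptions; both estimates should be close to $\xi^2/m$.

```python
import random


def simulate_signal_moments(eta=0.3, xi=1.0, m=5, draws=100_000, seed=0):
    """Estimate var[delta | eta] and cov[delta, u | eta] for a sample-mean signal.

    Illustrative assumption: each log relative-preference is eta + N(0, xi^2),
    and the signal delta is the mean of the voter's own preference and m-1 others.
    """
    rng = random.Random(seed)
    us, deltas = [], []
    for _ in range(draws):
        sample = [rng.gauss(eta, xi) for _ in range(m)]
        us.append(sample[0])            # the voter's own preference
        deltas.append(sum(sample) / m)  # her sample-mean signal
    mean_u = sum(us) / draws
    mean_d = sum(deltas) / draws
    var_d = sum((d - mean_d) ** 2 for d in deltas) / (draws - 1)
    cov_du = sum((u - mean_u) * (d - mean_d) for u, d in zip(us, deltas)) / (draws - 1)
    return var_d, cov_du


var_delta, cov_delta_u = simulate_signal_moments()
print(var_delta, cov_delta_u)  # both estimates should be near xi**2 / m = 0.2
```

The covariance equals the signal variance because the voter's own preference enters the sample mean with weight $1/m$ and is independent of the other sampled preferences.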


Following her observation of $\delta_i$, voter $i$ updates her diffuse prior to form a posterior belief
$\eta \mid \delta_i \sim N(\delta_i, \kappa^2)$.²⁹ $\delta_i = E[\eta \mid \delta_i]$ is voter $i$’s expectation of the median voter’s identity and
$1/\kappa^2$ is the precision of her posterior beliefs. Importantly, different voters observe different
signal realizations, and hence entertain different beliefs about the likely support for each of
the challengers; $\kappa^2$ measures the variation in such opinions across the electorate.


2.5. Signal Accuracy. A voter can always rely on introspection, and hence her signal
precision will be at least $1/\xi^2$. Within the context of social-communication, the precision of
a voter’s posterior beliefs satisfies $1/\kappa^2 = m/\xi^2$: it increases linearly with $m$. This suggests


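28 Proofs of any formal claims are contained in the appendices.
29 Alternatively, if she begins with the prior $\eta \sim N(\mu, \sigma^2)$, then Bayesian updating (DeGroot 1970) yields
\[ \eta \mid \delta_i \sim N\!\left( \frac{\sigma^2 \delta_i + \kappa^2 \mu}{\kappa^2 + \sigma^2},\; \frac{\kappa^2 \sigma^2}{\kappa^2 + \sigma^2} \right). \]
Allowing $\sigma^2 \to \infty$ yields a posterior belief of $\eta \mid \delta_i \sim N(\delta_i, \kappa^2)$.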

that the accuracy of a voter’s perception of the electoral situation will grow with the size of
her social network. If voters sample individuals who are similar to them, however, then the 
effective precision of their information (as measured by m) may be dramatically lower. 


Suppose, for instance, that individuals live within communities, and interact only with members
of their own community, where a community is a small subset of the electorate. Any
common cross-community idiosyncratic shock will then generate a tight lower bound to $\kappa^2$,
since the sampling procedure cannot eliminate community effects. For instance, if 20% of
cross-electorate preference variation is due to variation across communities, then it can be
shown that $\kappa^2 > \xi^2/5$, or equivalently $m < 5$. I explore this idea in Section 6 below.


Naturally, a voter’s information is not limited to social communication. A further source
might be the publication of opinion polls or other media input.³⁰ In many elections, however,
these tend to occur at the national level, whereas candidates are elected at a regional level.
At a regional level, opinion polls are less common.³¹


Nevertheless, the possibility of opinion polls must be taken seriously, and may still be viewed
within the context of Assumption 2. For instance, if a poll were to perfectly identify $\eta$, then
$\kappa^2$ would correspond to any (possibly small) noise in a voter’s observation of it. In fact, a
signal $\delta_i$ correctly identifies the leading challenger with probability $\alpha = \Phi(|\eta|/\kappa)$, and hence
$\alpha$ is the accuracy of a voter’s observation of the media.³² Allowing $\kappa^2 \to 0$ (or equivalently
$\alpha \to 1$) generates the perfect observation of a perfectly revealing public information source.


Returning to the social-communication micro-foundation described in Section 2.4, the notion
of a signal’s accuracy may also be used, and $\alpha$ may be employed as a primitive parameter
in the specification of the voter’s information. Since $\eta = \xi\Phi^{-1}(\pi)$, and $\kappa^2 = \xi^2/m$, then
$\alpha = \Phi(\sqrt{m}\,|\Phi^{-1}(\pi)|)$, or equivalently $m = [\Phi^{-1}(\alpha)/\Phi^{-1}(\pi)]^2$. Suppose, for instance, that
$\pi = 0.6$ and that voters can identify the leading challenger with accuracy $\alpha = 0.85$. This
corresponds to a value of $m = 16.8$, and hence is equivalent to the observation of a private
sample of approximately 17 individuals. Put differently, a voter with a social network of 17
people will identify the wrong candidate as the leading challenger with 15% probability. It is
worth emphasizing that this statement continues to lean upon the assumption that a voter’s
social network forms a representative sample of the wider anti-incumbent society.
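
The mapping between signal accuracy and equivalent sample size can be evaluated directly. The sketch below is illustrative only: the function names are my own, and it relies on the standard-normal routines in Python's `statistics.NormalDist`. It computes $m = [\Phi^{-1}(\alpha)/\Phi^{-1}(\pi)]^2$ and inverts the calculation as a consistency check.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def equivalent_sample_size(alpha, pi):
    """Sample size m whose sample-mean signal has accuracy alpha when support is pi."""
    return (STD_NORMAL.inv_cdf(alpha) / STD_NORMAL.inv_cdf(pi)) ** 2


def signal_accuracy(m, pi):
    """Accuracy alpha = Phi(sqrt(m) * |Phi^{-1}(pi)|) of a sample-mean signal of size m."""
    return STD_NORMAL.cdf(math.sqrt(m) * abs(STD_NORMAL.inv_cdf(pi)))


m = equivalent_sample_size(alpha=0.85, pi=0.6)
print(m)                        # roughly 17 individuals
print(signal_accuracy(m, 0.6))  # recovers an accuracy of about 0.85
```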


30 McKelvey and Ordeshook (1985) explored a model in which voters base their decision on such sources.
31 The 1997 UK General Election provides an example. Evans, Curtice and Norris (1998) noted that 47
nationwide opinion polls were conducted during the election campaign. By contrast, only 29 polls were
conducted in 26 different constituencies at a constituency level, out of a total of 659 constituencies.
32 Suppose, for instance, that $\eta > 0$, so that candidate $j = 1$ is the leading challenger. The signal $\delta_i$ correctly
identifies him when $\delta_i > 0$. Thus the accuracy of the signal is $\alpha = \Pr[\delta_i > 0 \mid \eta] = \Phi(\eta/\kappa)$.

3. OPTIMAL VOTING BEHAVIOR IN LARGE ELECTORATES


In Section 4, I will consider a voting game in which the electorate of anti-incumbent voters is 
a unit mass. For such a game, payoffs are not immediately defined: an individual voter has 
no effect on the election outcome. To specify payoffs for such a game, I must first consider a 
voter’s optimal behavior in a finite electorate. Taking an appropriate limit as $n \to \infty$ then
yields an analytically tractable game played by a continuum of voters. 


3.1. Optimal Voting Behavior. Consider the decision of voter $i = 0$; on occasion, I will
refer to her as the focal voter, and for notational simplicity I will drop the subscript $i$ when
I consider her decisions. She may only influence the election when she is pivotal. This
happens if, absent her vote, there is a tie between a challenger and the disliked incumbent.
To describe pivotal events, I use $x$ to denote the total number of votes cast for candidate 1
among the $n$ other anti-incumbent voters.³³ If $x = x_0 = \lfloor \gamma n \rfloor$, then one more vote will allow
candidate 1 to win. Similarly, if $n - x = x_0 = \lfloor \gamma n \rfloor \Leftrightarrow x = \lceil (1-\gamma) n \rceil$ then candidate 2 is
in the same position.³⁴ In both of these settings, $i = 0$ has a casting vote. Conditioning on
any information available to her, she will consider the probabilities of the two pivotal events
\[ q_1 = \Pr[x = \lfloor \gamma n \rfloor] \quad\text{and}\quad q_2 = \Pr[x = \lceil (1-\gamma) n \rceil], \]
where I recall that $x_0 = \lfloor \gamma n \rfloor$ is the (fixed) vote total for the disliked incumbent. A vote
for candidate 1 will generate a payoff gain of $u_1$ with probability $q_1$, and hence an expected
payoff gain of $q_1 u_1$. Similarly, a vote for candidate 2 yields an expected payoff gain of $q_2 u_2$.
These observations lead to the following simple lemma.


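Lemma 2. For voter $i = 0$ with payoffs $u_1$ and $u_2$, an optimal voting rule must satisfy:
\[ \text{vote for } \begin{cases} j = 1 & \text{if } q_1 u_1 > q_2 u_2, \\ j = 2 & \text{if } q_2 u_2 > q_1 u_1, \text{ and} \\ j = 1 \text{ or } j = 2 & \text{if } q_1 u_1 = q_2 u_2. \end{cases} \]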


Notice that whenever a challenger enjoys strong support (that is, whenever $x > \lfloor \gamma n \rfloor$ or
$n - x > \lfloor \gamma n \rfloor$) then a single vote has no effect. It follows that voter $i = 0$ has no interest in
such events. For instance, a belief that candidate 1 is very likely to enjoy the vast amount of
challenger support does not necessarily attract her vote: she is only interested in outcomes in
which a candidate just reaches the required qualified-majority $\gamma$ from within the population
of anti-incumbent voters. When this is possible for both candidates (so that $\min\{q_1, q_2\} > 0$)
the following definition may be employed.

Definition 1. The pivotal log likelihood-ratio or strategic incentive is $\lambda = \log[q_1/q_2]$.


33 Hence, if the focal voter $i = 0$ votes for candidate 1, then $x_1 = x + 1$ and $x_2 = n - x$.
34 Here $\lfloor y \rfloor$ indicates the greatest integer that is weakly less than $y$.




Using this definition, together with Lemma 2, voter $i = 0$ should vote for $j = 1$ whenever
\[ \log\left[\frac{u_1 q_1}{u_2 q_2}\right] = \log\left[\frac{u_1}{u_2}\right] + \log\left[\frac{q_1}{q_2}\right] > 0 \quad\Leftrightarrow\quad \tilde{u} + \lambda > 0. \]
She balances her relative preference for the challenging candidates (represented by $\tilde{u}$) against
the relative likelihood that her vote influences the outcome of the election (represented by
its logarithm, the strategic incentive $\lambda$). When $\lambda = 0$ (so that $q_1 = q_2$) the voter finds
each pivotal event to be equally likely, and hence she faces no strategic incentive; she votes
straightforwardly for her most preferred candidate.
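
The decision rule can be sketched in a few lines. The function names and numerical values below are illustrative assumptions rather than part of the model's specification. The example shows a voter who prefers candidate 2 but votes for candidate 1, because a tie involving candidate 1 is three times as likely.

```python
import math


def strategic_incentive(q1, q2):
    """Pivotal log likelihood-ratio lambda = log(q1 / q2); requires q1, q2 > 0."""
    return math.log(q1 / q2)


def optimal_vote(u1, u2, q1, q2):
    """Return 1 or 2 according to the sign of u_tilde + lambda.

    Lemma 2 leaves the choice open under exact indifference; this sketch
    arbitrarily returns 2 in that case.
    """
    u_tilde = math.log(u1 / u2)  # log relative preference
    return 1 if u_tilde + strategic_incentive(q1, q2) > 0 else 2


# Candidate 2 is worth twice as much to this voter, but candidate 1 is
# three times as likely to be involved in a tie: she votes strategically.
print(optimal_vote(u1=1.0, u2=2.0, q1=0.03, q2=0.01))
# With equally likely pivotal events she votes for her preferred candidate.
print(optimal_vote(u1=1.0, u2=2.0, q1=0.01, q2=0.01))
```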


Lemma 2 describes optimal behavior contingent on a voter’s beliefs about pivotal events. She 
will use her signal and her expectation of the strategies used by others to form these beliefs. 
At this point I restrict attention to identity-independent strategies that are contingent only 
on payoff-relevant information: according to the following definition, a voter’s choice depends 


only upon the realization of $\tilde{u}_i$ and $\delta_i$, and not on the index $i$.³⁵


Definition 2. An identity-independent voting strategy $v(\delta_i, \tilde{u}_i) : \mathbb{R}^2 \to [0,1]$ is the probability
of a vote for candidate 1, contingent on the signal and preferences. This strategy is
monotonic if it is (weakly) increasing in its arguments $\delta_i$ and $\tilde{u}_i$. It involves strict multi-candidate
support if it takes both of the values 0 and 1 for appropriate choices of $\delta_i$ and $\tilde{u}_i$.
Finally, it is fully coordinated if either $v(\delta_i, \tilde{u}_i) = 0$ for all $\delta_i, \tilde{u}_i$ or $v(\delta_i, \tilde{u}_i) = 1$ for all $\delta_i, \tilde{u}_i$.


Notice that the specification of an identity-independent strategy yields a strategy profile for 
all voters. A monotonic strategy means that an increase in the preference and signal for 
a challenger cannot reduce the probability of a vote for that candidate. A voting strategy 
exhibits multi-candidate support if a voter supports each candidate with certainty, for an 
appropriate choice of signal and preferences. Finally, a fully coordinated voting-strategy 
involves a definite vote for one of the candidates independent of the signal and preference 


realization. 


3.2. No Electoral Uncertainty. What would happen if the common component $\eta$ (and
hence the identity of the median anti-incumbent voter, which also corresponds to the true
underlying support $\pi$ for the first challenger) were known? Restricting to the identity-independent
strategies of Definition 2, votes are contingent solely on realized payoffs and
signals. Conditional on $\eta$, these payoffs and signals are independently distributed. It
follows that, again conditional on $\eta$, voting decisions are independent, and I may write


35 In this sense, strategy profiles that satisfy Definition 2 are symmetric. This does not, however, mean that
all voters take the same action: a realized vote depends upon individual preference and signal realizations.
A desire to maintain this distinction motivates the (slightly more awkward) identity-independent qualifier.

$p = E[v(\delta_i, \tilde{u}_i) \mid \eta]$ for the (statistically independent) probability that a randomly selected individual
votes for candidate 1.³⁶ The vote total $x$ for candidate 1 among the $n$ individuals
$i \geq 1$ follows a binomial distribution with parameters $p$ and $n$. The pivotal probability of a
tie involving challenger $j = 1$ satisfies
\[ q_1 = \Pr[x = \lfloor \gamma n \rfloor] = \binom{n}{\lfloor \gamma n \rfloor} p^{\lfloor \gamma n \rfloor} (1-p)^{n - \lfloor \gamma n \rfloor} \to 0 \quad\text{as } n \to \infty, \]
and similarly for $q_2$: in a large electorate, the absolute probability of a pivotal outcome falls to
zero. As is well understood, however, it is not the absolute but rather the relative likelihood
of pivotal events, captured by the strategic incentive $\lambda$, that determines voting behavior.
Taking the position of the voter $i = 0$, $\lambda$ may be evaluated. Defining $\gamma_n = \lfloor \gamma n \rfloor / n$,³⁷
\[ \lambda = \log\left[\frac{q_1}{q_2}\right] = \log\left[\frac{p^{\gamma_n}(1-p)^{1-\gamma_n}}{p^{1-\gamma_n}(1-p)^{\gamma_n}}\right]^{n} \to \begin{cases} +\infty & p > 1/2 \\ -\infty & p < 1/2 \end{cases} \quad\text{as } n \to \infty. \]
Unless $p = 1/2$, the size of the strategic incentive $\lambda$ grows without bound as the electorate
grows large.³⁸ If voter $i = 0$ acts optimally, she will almost always vote for candidate 1
whenever $p > 1/2$. Extending this optimal response to other voters, there would be complete
coordination, with the entire electorate strategically abandoning candidate 2.
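
The divergence of the strategic incentive when $p$ is known can be seen numerically. In the sketch below (the function name and the values $p = 0.55$, $\gamma = 0.6$ are illustrative assumptions), the two binomial coefficients cancel in the ratio $q_1/q_2$, so that $\lambda = (2\lfloor\gamma n\rfloor - n)\log[p/(1-p)]$, which grows linearly in $n$ whenever $p \neq 1/2$ and $\gamma > 1/2$.

```python
import math


def strategic_incentive_known_p(p, gamma, n):
    """lambda = log(q1 / q2) when the vote total x is Binomial(n, p).

    q1 = Pr[x = k] and q2 = Pr[x = n - k] with k = floor(gamma * n); the
    binomial coefficients C(n, k) and C(n, n - k) are equal and cancel.
    """
    k = math.floor(gamma * n)
    return (2 * k - n) * math.log(p / (1 - p))


for n in (100, 1_000, 10_000):
    print(n, strategic_incentive_known_p(p=0.55, gamma=0.6, n=n))
```

The incentive is exactly zero only at $p = 1/2$, and it changes sign when $p$ falls below one half.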


3.3. Uncertain Common Effect. But what if $\eta$ (the median anti-incumbent voter’s identity)
is uncertain? Conditional on $\eta$, voting decisions continue to be binomial with parameters
$p$ and $n$; but since $p$ depends on $\eta$, and $\eta$ is unknown, it follows that $p$ is unknown.

I focus on monotonic strategies (Definition 2). If the strategy adopted by others is fully
coordinated, then irrespective of $\eta$, there is no possibility of a pivotal outcome.³⁹ Consider
monotonic strategies that exhibit multi-candidate support. It is immediate that
$p = E[v(\delta_i, \tilde{u}_i) \mid \eta]$ is a continuous function of $\eta$. Given the signal specification, $\delta$ is a sufficient
statistic for $\eta$. I write $f(p \mid \delta)$ for the density of the focal voter’s conditional beliefs over $p$.


Lemma 3. Suppose that $v(\delta_i, \tilde{u}_i)$ is monotonic and exhibits strict multi-candidate support.
Defining $p = E[v(\delta_i, \tilde{u}_i) \mid \eta]$, the density $f(p \mid \delta)$ is continuous and strictly positive on $(0,1)$.

With this in hand, the pivotal probability of a tie involving candidate $j = 1$ becomes
\[ q_1 = \int_0^1 \binom{n}{\gamma_n n} p^{\gamma_n n} (1-p)^{(1-\gamma_n) n} f(p \mid \delta)\, dp, \tag{1} \]


36 Notice that $p$ is the probability of a vote for candidate 1, whereas $\pi = \Pr[\tilde{u}_i > 0 \mid \eta]$ is the probability that
a voter’s first preference is for candidate 1. In the absence of any strategic voting, $p = \pi$.

37 This notation helps me to deal with integer issues in a convenient way; note that $\gamma_n \to \gamma$ as $n \to \infty$.
38 Employing the terminology of Myerson (2002), only one of the challenging candidates is “serious” whereas
the other is “out of contention.”

39 Thus $q_1$ and $q_2$ are exactly zero, rather than zero in the limit as $n \to \infty$. In other words, if a voter expects
all others to fully coordinate, then strategic considerations will offer no guide to her vote choice.

and similarly for $q_2$. The binomial probability term in the integrand represents idiosyncratic
uncertainty: even with full knowledge of $p$, the decision of any individual is unknown, since it
rests upon the realization of her signal and preferences. The second term $f(p \mid \delta)$ represents
electoral uncertainty: from the perspective of voter $i = 0$, the electorate-wide support for
candidate 1 (represented by $p$) is unknown. For models in which the electoral situation is
common knowledge, idiosyncratic uncertainty is present, but electoral uncertainty is not.

As $n \to \infty$, the probability $q_1$ vanishes to zero. Its asymptotic properties are interesting,
however; they are recorded in the following result, which represents a minor extension of
Proposition 1 from Chamberlain and Rothschild (1981).⁴⁰


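Proposition 1. Under the conditions of Lemma 3, the pivotal probabilities satisfy
\[ \lim_{n\to\infty} (n+1)\, q_1 = f(\gamma \mid \delta), \quad \lim_{n\to\infty} (n+1)\, q_2 = f(1-\gamma \mid \delta), \quad\text{and}\quad \lim_{n\to\infty} \frac{q_1}{q_2} = \frac{f(\gamma \mid \delta)}{f(1-\gamma \mid \delta)}. \]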


This is an application of the law of large numbers: in a large electorate, the relative likelihood
of the two pivotal events is the relative likelihood that the challengers’ support levels
(represented by $p$ and $1 - p$) coincide with the critical qualified-majority $\gamma$. Importantly,
then, it is only electoral uncertainty over $p$ (generated by uncertainty over the common effect
$\eta$) that matters; equivalently, when the electoral situation is not common knowledge, then
idiosyncratic uncertainty plays no role. Why is this? As the electorate grows large, the idiosyncrasies
in payoff and signal realizations are averaged out. Uncertainty over the common
component $\eta$, however, cannot be averaged out and hence becomes the key determinant.⁴¹

An immediate implication is that the limiting strategic incentive $\lambda$ is finite, and hence voter
$i = 0$ may well find it optimal to vote for either candidate. Of course, this leaves open the
possibility that strategic voting may be self-reinforcing, leading to an equilibrium outcome
that involves full coordination of strategic voting, an issue that is addressed in Section 4.
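
The limits in Proposition 1 can be illustrated numerically. The sketch below is not the model's belief structure: purely for tractability it assumes that beliefs over $p$ follow a Beta density (here Beta(3, 2)), in which case the integral of the binomial probability against the density has a closed form (the beta-binomial probability). The function names and the choice $\gamma = 0.6$ are likewise illustrative assumptions.

```python
import math


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def pivotal_prob(n, k, a, b):
    """Pr[x = k] when x | p ~ Binomial(n, p) and p ~ Beta(a, b) (an assumed belief)."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return math.exp(log_choose + log_beta(k + a, n - k + b) - log_beta(a, b))


def beta_density(p, a, b):
    """Density of Beta(a, b) at p, playing the role of f(p | delta)."""
    return math.exp((a - 1) * math.log(p) + (b - 1) * math.log(1 - p) - log_beta(a, b))


gamma, a, b = 0.6, 3.0, 2.0
for n in (100, 10_000, 1_000_000):
    k = math.floor(gamma * n)
    q1 = pivotal_prob(n, k, a, b)      # tie involving candidate 1
    q2 = pivotal_prob(n, n - k, a, b)  # tie involving candidate 2
    print(n, (n + 1) * q1, (n + 1) * q2, q1 / q2)

# Limiting values predicted by Proposition 1 under the assumed density.
print(beta_density(gamma, a, b), beta_density(1 - gamma, a, b))
```

As $n$ grows, the scaled probabilities approach the density evaluated at $\gamma$ and $1-\gamma$, and their ratio remains finite, in contrast to the known-$p$ case.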


3.4. Voting in Large Electorates. My next step is to analyze a voting game in a large 
electorate (by allowing $n \to \infty$). Of course, if a continuum of voters is specified directly,
then a single vote can have no effect. Equivalently, the probability of a pivotal outcome and 
hence the payoff response to any vote vanish to zero as the electorate grows large. I have 
observed, however, that it is not the absolute values of payoffs and probabilities that are 
of importance, but rather their relative size. In the finite-population voting-game, I may 


40 Chamberlain and Rothschild (1981) were primarily concerned with the order of the probability of a tied
outcome as $n \to \infty$; they noted that “[T]here have been to our knowledge few attempts to calculate the
relevant probability. The calculations which we know of are neither general nor rigorous.” Their result was
used by Mulligan and Hunter (2003) in their assessment of the empirical frequency of a pivotal vote.
41 This is not a new observation; Gelman, Katz, and Tuerlinckx (2002, p. 429) explained that idiosyncratic
uncertainty is removed “because the binomial variation from which it derives is minor compared to any
realistic variation among the probabilities.” In other words, only electoral uncertainty matters. They noted
that this observation was also made by Good and Mayer (1975) and Margolis (1983).

re-scale the payoffs by multiplying through by the electorate size $n + 1$. The payoff for a vote
for candidate $j$ becomes $(n + 1) q_j u_j$. Taking the limit as $n \to \infty$, the payoffs for a “limit
game” (with a unit mass of voters) may then be defined.

I restrict attention to the focal voter’s payoffs whenever all other voters $i \neq 0$ adopt a monotonic
voting strategy with multi-candidate support; this will be sufficient for the analysis of
Section 4. I observe that $(n+1) q_1 \to f(\gamma \mid \delta)$ and $(n+1) q_2 \to f(1-\gamma \mid \delta)$ as $n \to \infty$, via
Proposition 1. This suggests a payoff of $u_1 f(\gamma \mid \delta)$ in response to a vote for $j = 1$, and a
corresponding payoff of $u_2 f(1-\gamma \mid \delta)$ in response to a vote for $j = 2$.


This discussion leads me to specify the limit game in the following way. There is a continuum
of anti-incumbent voters, indexed by $i \in [0,1]$. Each voter $i$ chooses a vote $j \in \{1,2\}$,
conditional on the realization of $\tilde{u}_i$ and $\delta_i$, which in turn follow the specifications described
in Section 2. It remains for me to specify an individual voter’s payoffs.


Assumption 3. Consider a limit game, with a unit mass of voters. When voters use a
monotonic strategy with multi-candidate support (Definition 2) an individual’s payoffs satisfy
\[ U(\text{Vote 1} \mid v(\delta_i, \tilde{u}_i), \delta) = f(\gamma \mid \delta)\, u_1, \quad\text{and} \]
\[ U(\text{Vote 2} \mid v(\delta_i, \tilde{u}_i), \delta) = f(1-\gamma \mid \delta)\, u_2. \]


In other circumstances, payoffs may be specified arbitrarily. 


3.5. The Wasted-Vote Phenomenon. Re-scaling the payoffs by the electorate size may
be an appropriate specification of payoffs. A familiar critique of voting models is that all
votes are wasted in a large electorate.⁴² Proposition 1 confirms that $q_j \to 0$ at a rate of
$1/(n+1)$ as $n \to \infty$.⁴³ An expansion of the electorate, however, may also change the
size of the payoffs received by a voter in the event of a win by each of the challengers. If
the electorate size $n + 1$ were to measure the importance of the election in the eyes of a
voter, then $(n+1) u_j$ might well be an appropriate specification of the payoff when $j$ wins.
In other words: voters may care more about more important elections. The presence of
electoral uncertainty (in the form of the density $f(p)$) ensures that such scaled payoffs are
well-defined in the limit.


42 Setting forth his views in the form of a dialogue, Meehl (1977) argued that “all economic theories of voter
participation are radically incoherent, because such participation is irrational as an instrumental action
toward an egocentric end.” Similarly, Palfrey and Rosenthal (1985) asked: “why does anyone bother to vote,
given that voting is presumably costly and that the probability that one’s vote will affect the outcome is
presumably small?” Responses to this question have been provided by Ferejohn and Fiorina (1974), Palfrey
and Rosenthal (1983), Ledyard (1984), Morton (1991), Feddersen (1992), and Aldrich (1993), inter alios.
43 The observation that the probability of a tie, when $p$ is unknown, is of order $1/n$ is due to Chamberlain
and Rothschild (1981). When $p$ is known, then the election outcome is the realization of a binomial random
variable. For $p = 1/2$, the probability of a tie is of order $1/\sqrt{n}$, and when $p \neq 1/2$ it is of order $\exp(-cn)$ for
some $c > 0$; see also Beck (1975) and Margolis (1977).

4. OPTIMAL VOTING AND EQUILIBRIUM


I now focus exclusively on the limit game of Section 3.4, corresponding to $n \to \infty$, and
envisage a unit mass of voters with payoffs as specified in Assumption 3.


4.1. Linear Voting Strategies. I begin with the optimal behavior of an individual voter 
when all others adopt a monotonic voting strategy exhibiting multi-candidate support. 


Lemma 4. If all other voters adopt a monotonic voting strategy exhibiting strict multi-candidate
support (Definition 2), then the best response for an individual is to use a linear
voting strategy: she votes for candidate 1 whenever $\tilde{u} + a + b\delta \geq 0$, for some $a \in \mathbb{R}$, $b \in \mathbb{R}_+$.


Lemma 4 ensures that, so long as voters react positively to both their payoffs and their
private information, then I may focus on the class of simple linear strategies. This is a
consequence of the following reasoning: when members of the electorate employ a voting
strategy that is positively related to signals and preferences, then there will be two values $\eta_1$
and $\eta_2 < \eta_1$ of the common component that result in $p = \gamma$ and $p = 1 - \gamma$ respectively. An
individual will, therefore, consider the log likelihood ratio of the events $\eta = \eta_1$ and $\eta = \eta_2$.
Of course, posterior beliefs over $\eta$ are normal with a mean of $\delta$. Log likelihood ratios of the
normal are linear in its mean. This leads to a strategic incentive that satisfies
\[ \lambda(\delta) = \log\left[\frac{f(\gamma \mid \delta)}{f(1-\gamma \mid \delta)}\right] = a + b\delta \]
for some $a$ and $b > 0$. An individual voter finds it optimal to vote for candidate 1 whenever
$\tilde{u} + \lambda(\delta) \geq 0$,⁴⁴ which is equivalent to $\tilde{u} + a + b\delta \geq 0$.
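
The linearity claim rests on a standard property of the normal density, which the following sketch verifies numerically. The function names and the particular values of $\eta_1$, $\eta_2$ and $\kappa$ are illustrative assumptions. With posterior beliefs $\eta \mid \delta \sim N(\delta, \kappa^2)$, the log likelihood ratio of $\eta = \eta_1$ against $\eta = \eta_2$ equals $a + b\delta$ with $b = (\eta_1 - \eta_2)/\kappa^2$ and $a = (\eta_2^2 - \eta_1^2)/(2\kappa^2)$.

```python
import math
from statistics import NormalDist


def log_likelihood_ratio(delta, eta1, eta2, kappa):
    """log of posterior density at eta1 over density at eta2, for eta | delta ~ N(delta, kappa^2)."""
    posterior = NormalDist(mu=delta, sigma=kappa)
    return math.log(posterior.pdf(eta1) / posterior.pdf(eta2))


def linear_coefficients(eta1, eta2, kappa):
    """Intercept a and slope b such that the log likelihood ratio equals a + b * delta."""
    b = (eta1 - eta2) / kappa**2
    a = (eta2**2 - eta1**2) / (2 * kappa**2)
    return a, b


a, b = linear_coefficients(eta1=0.5, eta2=-0.3, kappa=0.8)
for delta in (-1.0, 0.0, 0.7):
    # The two columns agree up to floating-point error.
    print(delta, log_likelihood_ratio(delta, 0.5, -0.3, 0.8), a + b * delta)
```

The slope is positive because $\eta_1 > \eta_2$, and the intercept vanishes when the two critical values are symmetric about zero.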


The linearity of optimal voting strategies yields these implications: first, the class of identity- 
independent monotonic strategies exhibiting multi-candidate support is closed under best 
response, since the linear strategies of Lemma 4 are within this class; second, the class of 
linear strategies is closed under best response; third, any identity-independent monotonic 
voting equilibrium exhibiting multi-candidate support must involve linear strategies. 


To say more about the nature of optimal voting strategies, and the existence and nature of 
equilibrium strategies, requires the exact properties of the parameters a and b. 


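Lemma 5. If voters adopt a linear strategy $v(\delta_i, \tilde{u}_i) = 1 \Leftrightarrow \tilde{u}_i + a + b\delta_i \geq 0$, then the best
response for an individual is to adopt a linear strategy $v(\delta, \tilde{u}) = 1 \Leftrightarrow \tilde{u} + \hat{a} + \hat{b}\delta \geq 0$, with
\[ \hat{a}(a, b) = \frac{\hat{b}\, a}{1 + b} \quad\text{and}\quad \hat{b}(b) = \frac{2\Phi^{-1}(\gamma)\sqrt{\xi^2 + (b^2 + 2b)\kappa^2}}{\kappa^2 (1 + b)}. \tag{2} \]
The slope parameter $\hat{b}(b)$ is decreasing from $\hat{b}(0) = 2\xi\Phi^{-1}(\gamma)/\kappa^2$ to $\lim_{b\to\infty} \hat{b}(b) = 2\Phi^{-1}(\gamma)/\kappa$.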


44 Without loss of generality, I assume that she votes for candidate 1 when indifferent.

[Figure 1: density over the median voter $\eta$]
FIGURE 1. Explaining Negative Feedback


Linear strategies are easily interpreted. The intercept $a$ is the strategic incentive faced by
a voter following a neutral signal $\delta = 0$. It is a signal-independent bias toward candidate 1
(when $a > 0$) or candidate 2 (when $a < 0$). From Lemma 5 a voter will exhibit a bias (e.g.
$\hat{a} > 0$) if and only if she expects other voters to exhibit such a bias ($a > 0$).

In contrast, $\hat{b}$ represents the response of the strategic incentive to a voter’s private signal.
The shape of $\hat{b}(b)$ shows how an individual’s use of her signal changes in response to increased
($b \uparrow$) or decreased ($b \downarrow$) responsiveness on the part of others; an issue to which I now turn.
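
The shape of the best-response slope can be tabulated directly. The sketch below takes the slope formula of Lemma 5 to be $\hat{b}(b) = 2\Phi^{-1}(\gamma)\sqrt{\xi^2 + (b^2+2b)\kappa^2}\,/\,[\kappa^2(1+b)]$, which is my reading of equation (2) and matches the two endpoint values stated in the lemma. The function name and the parameter values ($\gamma = 0.6$, $\xi = 1$, $\kappa = 0.5$) are illustrative assumptions.

```python
import math
from statistics import NormalDist


def best_response_slope(b, gamma, xi, kappa):
    """Slope b_hat(b) of the best response when others use slope b.

    Assumes gamma > 1/2 (so that Phi^{-1}(gamma) > 0) and kappa < xi.
    """
    z = NormalDist().inv_cdf(gamma)
    return 2 * z * math.sqrt(xi**2 + (b**2 + 2 * b) * kappa**2) / (kappa**2 * (1 + b))


gamma, xi, kappa = 0.6, 1.0, 0.5
for b in (0.0, 0.5, 1.0, 5.0, 50.0):
    # b_hat falls as others respond more strongly to their signals.
    print(b, best_response_slope(b, gamma, xi, kappa))
```

The printed values fall from $2\xi\Phi^{-1}(\gamma)/\kappa^2$ at $b = 0$ toward the lower limit $2\Phi^{-1}(\gamma)/\kappa$, which is the negative feedback at issue.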


4.2. Strategic Voting and Negative Feedback. I have suggested that strategic voting
may exhibit negative feedback. Since $a = 0$ in a stable voting equilibrium (I demonstrate
this fact, just below) the appropriate vehicle for evaluating this hypothesis is $b$.


Lemma 5 states that $\hat{b}(b)$ is decreasing: an increase in the tendency by others to vote
strategically ($b \uparrow$) reduces the tendency of an individual to do so ($\hat{b} \downarrow$). Figure 1 helps me to
explain this phenomenon. Suppose that all voters cast their votes in favor of their true first preference
(equivalent to $a = 0$ and $b = 0$) and that an individual receives a signal $\delta$. Her posterior
belief over the identity of the median voter $\eta$ is distributed around $\delta$, as illustrated by the
density in Figure 1. The support from within the anti-incumbent electorate for candidate 1
is $p = \Pr[\eta + \varepsilon_i > 0] = \Phi(\eta/\xi) \Leftrightarrow \eta = \xi\Phi^{-1}(p)$. This candidate will just reach the required
qualified-majority $\gamma$ when $p = \gamma$, or equivalently $\eta = \eta_1 = \xi\Phi^{-1}(\gamma)$. Similarly, candidate 2
will just reach the qualified majority when $\eta = \eta_2 = -\xi\Phi^{-1}(\gamma)$. The focal voter will compare
the relative likelihood of the events $\eta = \eta_1$ and $\eta = \eta_2$. From Figure 1, $\eta = \eta_1$ is rather more
likely, and hence she faces a large incentive to vote strategically for candidate 1.

Suppose instead that all voters are willing to vote strategically by responding strongly to
their signals, so that $b > 0$. The support for candidate 1 just reaches $\gamma$ when $\eta = \hat{\eta}_1$, where
\[ \gamma = \Pr[\tilde{u}_i + b\delta_i \geq 0 \mid \eta = \hat{\eta}_1] = \Phi\!\left(\frac{(1+b)\hat{\eta}_1}{\sqrt{\operatorname{var}[\tilde{u}_i + b\delta_i \mid \eta]}}\right) \quad\Leftrightarrow\quad \hat{\eta}_1 = \frac{\Phi^{-1}(\gamma)\sqrt{\xi^2 + (b^2 + 2b)\kappa^2}}{1+b}, \]
where the evaluation of the variance term employs Lemma 1. For the common component
to reach $\hat{\eta}_1$ is a less extreme criterion than for it to reach $\eta_1$, since
\[ \hat{\eta}_1 = \frac{\Phi^{-1}(\gamma)\sqrt{\xi^2 + (b^2 + 2b)\kappa^2}}{1+b} < \xi\Phi^{-1}(\gamma) = \eta_1, \]
where the inequality follows from simple algebra and $\kappa^2 < \xi^2$. By the same logic, candidate
2 will just reach $\gamma$ when $\eta = \hat{\eta}_2 = -\hat{\eta}_1$. An individual voter will assess the relative likelihood
of $\eta = \hat{\eta}_1$ versus $\eta = \hat{\eta}_2$. Clearly (Figure 1) these two values are much closer together. It
is no longer the case that challenging candidate 1 is much more likely to contend for the
qualified majority, and this reduces the incentive to vote strategically.


Although negative feedback may seem surprising, it accords with intuition upon further
reflection. When $b$ is high, individuals respond strongly to their signals. This increases the
probability of a strategic vote in both directions. A voter with signal $\delta > 0$ is concerned that
others may observe signals $\delta_i < 0$, yielding a pivotal outcome involving candidate 2 rather
than candidate 1. For high $\delta$, this event seems most unlikely; surely candidate 1 will almost
certainly win? But if candidate 1 will almost certainly win, then the vote of an individual
has no effect; she can only influence the outcome when there is a tie. Now, if there were
such a tie, then her strong signal must have overstated the support for candidate 1. She
must therefore envisage a much lower true value for $\eta$. It is then more reasonable for her to
consider true values of the common component satisfying $\eta < 0$.


4.3. Voting Equilibria with Partial Coordination. I seek monotonic equilibria exhibiting
strict multi-candidate support. Following Lemma 4, these must involve linear strategies:
individual $i$ votes for candidate 1 if and only if $\tilde{u}_i + a + b\delta_i \geq 0$. Of course, if all voters
use such a strategy then it is a best response for a voter to use a linear strategy with coefficients
$\hat{a}(a, b)$ and $\hat{b}(b)$ (Lemma 5). For an equilibrium, I require a pair $\{a^*, b^*\}$ satisfying
$a^* = \hat{a}(a^*, b^*)$ and $b^* = \hat{b}(b^*)$. The second equation involves only $b$ and hence may be considered
independently. Lemma 7, contained in Appendix A, confirms that this equation has
a unique solution, and obtains bounds for it. Turning attention back to $a^*$, the fixed point
equation $a^* = b^* a^*/(1 + b^*)$ has a unique solution at $a^* = 0$.
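
The fixed point $b^*$ can be located numerically. The sketch below is illustrative only: it takes the best-response slope to be $\hat{b}(b) = 2\Phi^{-1}(\gamma)\sqrt{\xi^2 + (b^2+2b)\kappa^2}\,/\,[\kappa^2(1+b)]$ (my reading of Lemma 5), and uses bisection, which is a choice of convenience rather than the method of Appendix A. Since $\hat{b}$ is decreasing, $b - \hat{b}(b)$ is strictly increasing, negative at $b = 0$ and non-negative at $b = \hat{b}(0)$, so a unique root lies in between. The function names and parameter values are assumptions.

```python
import math
from statistics import NormalDist


def best_response_slope(b, gamma, xi, kappa):
    """Slope b_hat(b); assumes gamma > 1/2 and kappa < xi."""
    z = NormalDist().inv_cdf(gamma)
    return 2 * z * math.sqrt(xi**2 + (b**2 + 2 * b) * kappa**2) / (kappa**2 * (1 + b))


def equilibrium_slope(gamma, xi, kappa, iterations=200):
    """Solve b = b_hat(b) by bisection on [0, b_hat(0)]."""
    lo, hi = 0.0, best_response_slope(0.0, gamma, xi, kappa)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid - best_response_slope(mid, gamma, xi, kappa) < 0:
            lo = mid  # best response still exceeds b: root lies above
        else:
            hi = mid
    return 0.5 * (lo + hi)


gamma, xi, kappa = 0.6, 1.0, 0.5
b_star = equilibrium_slope(gamma, xi, kappa)
print(b_star, best_response_slope(b_star, gamma, xi, kappa))  # the two values coincide
```

With $a^* = 0$, the equilibrium strategy is then to vote for candidate 1 whenever $\tilde{u}_i + b^*\delta_i \geq 0$.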


Proposition 2. There is a unique monotonic voting equilibrium exhibiting multi-candidate 
support. It involves linear voting strategies and hence only partial strategic-voting.

Hence the partial coordination of strategic voting (implying multi-candidate support) is
consistent with equilibrium behavior on the part of instrumental voters. 


Other observations are available. The fact that $a^* = 0$ means that there can be no systematic
bias toward one candidate: all strategic voting is driven by the response of voters to their
informative signals of the electoral situation, as measured by $b^* > 0$. Of course, this is a
response to a voter’s signal of the electoral situation rather than the actual situation. For
$\eta > 0$, the realization of the signal for a particular voter $i$ may well satisfy $\delta_i < 0$; this yields
a strategic incentive for voter $i$ to switch away from the more preferred option. It follows
that strategic voting is bi-directional, with some voters switching in the wrong direction.⁴⁵

4.4. Equilibrium Selection. The partially coordinated equilibrium described in Proposi- 
tion 2 is unique in the class of monotonic equilibria with multi-candidate support. Moving 
beyond this class, there may be (depending critically on a fuller specification of payoffs 
beyond those made in Assumption 3) strictly Duvergerian equilibria: the entire electorate 
coordinates on a single candidate without reference to their payoff or signal realizations. In 
this case, the probability of a pivotal outcome is always zero and hence all instrumental voters 
will be exactly indifferent between the two candidates. Following a heavy exploitation of any 
such indifference, it is an equilibrium for them to all coordinate on a single candidate. Such 
strictly Duvergerian equilibria are, in the context of the present model, unsatisfactory
for a number of reasons. I highlight two critiques here, and consider a third in Section 4.5.


My first critique is that the indifference of voters would allow non-instrumental motivations 
to dominate. For instance, suppose that voters were to be equipped with the following 
lexicographic preference structure: a voter’s primary motivation is instrumental, unless the 
instrumental incentives are zero, in which case she votes for her truly favorite candidate.[46]
This modification eliminates any strictly Duvergerian equilibrium. It has no effect, however,
on the partially coordinated equilibrium described in Proposition 2.[47]


My second critique is that a strictly Duvergerian equilibrium requires some form of explicit 
coordination among voters. The source of such a coordination device is unclear. One possi- 
bility is that all voters perfectly observe a coordinating announcement by a single individual 


45 Anecdotally at least, this phenomenon was observed in the British General Election of 1997. The disliked 
Conservative incumbent Michael Howard stood for re-election in the Folkestone and Hythe constituency. 
Strategic voting was reputed to have occurred in both directions. Michael Howard retained his seat, polling 
39.0% of the vote. The left-wing parties split the anti-Conservative vote almost exactly: Labour 24.9% and 
Liberal Democrat 26.9%. Howard’s success in the 1997 election was important; at the time of writing he 
is the leader of Her Majesty’s Opposition. Thanks are due to Steve Fisher for this information, and Colin 
Thompson for his comments on this issue. 

46 Assumption 3 could be extended to specify $U_1 = u_1$ and $U_2 = u_2$ when all others coordinate.

47 A response to this critique might be that non-instrumental motivations would take over in the partially
coordinated equilibrium, since the absolute probability of a pivotal event falls to zero with n. A retort to this 
response is the argument of Section 3.5: instrumental payoffs may scale up with the size of the electorate, 
offsetting the fall in the absolute probability of a pivotal outcome. 
or organization. But how could voters be sure that they are all seeing the same announcement?
More subtly, even if all voters see the same coordinating announcement, and moreover
see that others are doing the same, this does not necessarily mean that there is common
knowledge of such an announcement.[48] In fact, an instrumental voter will always be interested
in exactly those circumstances in which the announcement fails, since it is only in such
circumstances that her vote will have any effect. This leads consideration back to equilibria
in which voting behavior is responsive to signals and preferences.


4.5. Stability. A further justification for the partially coordinated equilibrium of Proposi- 
tion 2 is this: an appropriately specified strategy-revision process converges to it. Specifically: 
beginning from any monotonic strategy with strict multi-candidate support, a sequence of 
updated best-responses will always lead back to the partially coordinated equilibrium. 


To understand this idea, begin with the initial hypothesis that all individuals vote
straightforwardly for their preferred candidate. This is equivalent to employing a linear strategy
with $a_0 = b_0 = 0$. An individual, acting optimally in response, will employ a linear strategy
with parameters $a_1 = \hat{a}(0,0)$ and $b_1 = \hat{b}(0)$. Of course, she may well anticipate a similar
response in the wider electorate, and hence update once more to obtain another linear strategy
$a_2 = \hat{a}(a_1, b_1)$ and $b_2 = \hat{b}(b_1)$. This thought experiment describes an iterative best-response
process within the class of linear voting strategies.[49] Of course, a starting point within the
class of monotonic voting strategies with strict multi-candidate support will enter this class
within one step (Lemma 4). Formally, I may define an iterative best-response process by
$b_t = \hat{b}(b_{t-1})$, $a_t = \hat{a}(a_{t-1}, b_{t-1})$. Global stability may be ascertained.[50] The mapping $\hat{b}(b)$ and
associated process $\{b_t\}$ are not contingent on $a_t$, and hence may be considered in isolation.
Lemma 8 in Appendix A confirms that the process $\{b_t\}$ follows a sequence of dampened
cycles, and converges to $b^*$. By inspection, $a_t = \hat{a}(a_{t-1}, b_{t-1})$ converges to $a^* = 0$.
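
The convergence of the intercept can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it takes the intercept update to be $a_t = b_{t-1}a_{t-1}/(1 + b_{t-1})$, the form implied by the fixed-point equation $a^* = b^*a^*/(1 + b^*)$ given earlier, and it feeds this update a hypothetical path of values for $b_t$. The numbers in `b_values` are invented to mimic a dampened cycle; they are not computed from Equation (2). Since $b/(1+b) < 1$ for any $b \geq 0$, each step shrinks $|a_t|$.

```python
def intercept_path(a0, b_path):
    """Iterate a_t = b_{t-1} * a_{t-1} / (1 + b_{t-1}) along a given path of b values."""
    path = [a0]
    for b in b_path:
        path.append(b * path[-1] / (1.0 + b))
    return path


# Hypothetical dampened cycle for b_t (illustrative numbers only, not model output).
b_values = [2.0, 0.8, 1.6, 1.0, 1.4, 1.1, 1.3, 1.2, 1.25, 1.22]

# Start from an arbitrary non-zero intercept to show that any initial bias dies out.
a_path = intercept_path(1.0, b_values)
for t, a in enumerate(a_path):
    print(t, round(a, 6))
```

The intercept falls geometrically whatever the path of $b_t$, provided that the path stays non-negative, which is why the process for $a_t$ can be settled "by inspection" once the process for $b_t$ is understood.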


48 Common knowledge is an extreme form of public identification: voters “...desert the (publicly identified)
trailing candidates in order to focus on the (publicly identified) front-running candidates.” (Cox 1997, p. 78)
The recent literature on global games (Morris and Shin 2003, Myatt, Shin, and Wallace 2002) demonstrates
that the lifting of common knowledge assumptions has a dramatic effect on the nature of equilibria. Cox
(1997) compared the present scenario to the classic “Battle of the Sexes” game, with two coordinated
pure-strategy Nash equilibria and a single mixed-strategy Nash equilibrium. In the present context, these
correspond to Duvergerian and non-Duvergerian equilibria. As the seminal contribution of Carlsson and van Damme
(1993) demonstrates, a slight weakening of the common knowledge assumption in Battle of the Sexes results 
in a unique Bayesian Nash equilibrium. The same is true here. 

49 Earlier work by Fey (1997) considered a process of repeated elections, beginning with an election in which
agents act truthfully. I view the iterative best-response process described here as a thought experiment prior 
to the act of voting, and hence multiple elections are unnecessary. 

50 Of course, the iterative best-response process cannot begin at a fully coordinated voting strategy profile;
facing such a strategy, a voter faces no instrumental incentives. Assuming that voters retained the initially 
postulated strategy profile when this is so would ensure that the process would not move away from a fully 
Duvergerian equilibrium. On the other hand, the process does not move towards such an equilibrium either; 
moreover, a fully Duvergerian starting point would, once again, be crucially dependent upon the exact 
common knowledge of the initial conditions and the absolute absence of any non-instrumental concerns. 
Proposition 3. From any monotonic strategy with multi-candidate support, an iterative
best-response process converges to the globally stable partially coordinated equilibrium. 


Hence a thought process by which a voter iteratively assesses the likely behavior of others
leads to a partially coordinated equilibrium. In particular, this is true when a voter begins
with the initial hypothesis that others will act truthfully ($a_0 = b_0 = 0$). No explicit
coordination device is necessary: iterative reasoning alone allows an instrumental voter to reach a
partially (but not fully) coordinated equilibrium voting strategy.


5. COMPARATIVE STATICS 


The strictly Duvergerian equilibria described by Palfrey (1989) and Myerson and Weber 
(1993) cannot be the subject of comparative-static exercises. Here, however, the partially 
coordinated equilibrium described in Proposition 2 is amenable to such analysis. 


5.1. Strategic Response. A voter’s willingness to vote strategically is captured by $b$, her
response to her private signal of the electoral situation. In the partially coordinated
equilibrium of Proposition 2, the solution $b^*$ to the fixed-point equation $b = \hat{b}(b)$ describes this
response. As a benchmark, I also consider $\hat{b}(0)$. This represents the response of a voter who
acts strategically, but expects all others to vote straightforwardly.[51] To ascertain the
response of $b^*$ and $\hat{b}(0)$ to parameter changes, I consider the mapping $\hat{b}(b)$. If $\hat{b}(b)$ is increasing
in an exogenous parameter for all $b$, then (since $\hat{b}$ is decreasing in $b$) so must be the fixed
point $b^*$ as well as $\hat{b}(0) = 2\Phi^{-1}(\gamma)\xi/\kappa^2$.


Examining Equation (2), it is immediate that $\hat{b}(b)$ is increasing in $\gamma$: a voter responds
more strongly to her signal when greater coordination is required. There is a sense in
which $\gamma$ represents the “safety” of a disliked incumbent office-holder. For instance, when
$\gamma \to 1$ full anti-incumbent coordination is needed. This may run against initial intuition: for
English parliamentary constituencies, it means that relatively safe Tory seats will encourage
anti-Conservative voters to coordinate their actions. Of course, a voter cares only about
influencing the outcome, and realizes that, for large $\gamma$, coordination behind the leading
candidate is crucial. Inspection of Equation (2) also reveals that $\hat{b}(b)$ is decreasing in $\kappa^2$.


Lemma 6. A voter’s strategic-incentive response to her private signal is increasing in the
required qualified-majority $\gamma$ and the precision of the informative signal $1/\kappa^2$.


Equation (2) also suggests that $\hat{b}(b)$ is decreasing in the idiosyncrasy of the electorate $\xi$.
This comparative static may be misleading, however. Fixing the proportion $\pi = \Phi(\eta/\xi)$
who prefer the first challenger, the median voter (equivalently, the common component of
Assumption 1) $\eta$ satisfies $\eta = \xi\Phi^{-1}(\pi)$. But this means that, for fixed $\pi$, $|\eta|$, and hence the
size of the signal $\delta$ that a voter is likely to receive, are increasing in $\xi$. When the electorate is
relatively idiosyncratic (high $\xi^2$) then voters respond more sluggishly to their private signals;
however, this is offset by the increased size of signals received.

51 Recall that voting straightforwardly is equivalent to a linear strategy with $a = b = 0$.


5.2. Strategic Incentives. To circumvent this issue, consider the actual strategic incentive
$\lambda = b\delta$ faced by a voter. The average (both mean and median) incentive is $bE[\delta] = b\eta =
b\xi\Phi^{-1}(\pi)$. Suppose that candidate $j = 1$ is the leading challenger ($\pi > 1/2$). Clearly,
the typical strategic incentive is higher when candidate 1’s position is relatively stronger.
Turning back to $\xi^2$, I must also address the fact that the information precision $1/\kappa^2$ may be
influenced by preference idiosyncrasy.


Recall the social-communication micro-foundation of Section 2.4 where a voter observes a
sample of $m$ log relative-preferences (including her own) from the electorate. This generated
$\kappa^2 = \xi^2/m$, and in this environment $m$ represents the information available to voters. Then


(0) = 20- wee + 26) 3. Bij en (3) 


It is easiest to consider the decision-theoretic case; that is, the world in which a strategic
voter conjectures that all others will vote straightforwardly. Then $b = \hat{b}(0)$, and hence the average
strategic incentive is

$$E[\lambda] = \hat{b}(0)\eta = \frac{2\Phi^{-1}(\gamma)m}{\xi} \times \xi\Phi^{-1}(\pi) = 2m\Phi^{-1}(\gamma)\Phi^{-1}(\pi). \tag{4}$$

In this decision-theoretic world the net effect of idiosyncrasy on strategic incentives is
zero, and the average strategic incentive depends in a simple way on the required
qualified-majority $\gamma$, the relative strength of the leading challenger $\pi$, and the sample size $m$.
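
The cancellation of idiosyncrasy in the decision-theoretic case can be checked numerically. The sketch below assumes the benchmark response $\hat{b}(0) = 2\Phi^{-1}(\gamma)\xi/\kappa^2$ with $\kappa^2 = \xi^2/m$, together with $\eta = \xi\Phi^{-1}(\pi)$, as read from the surrounding text. The function name and the parameter values are illustrative choices, not taken from the paper.

```python
from statistics import NormalDist

PHI_INV = NormalDist().inv_cdf  # standard normal quantile function


def average_incentive(gamma, pi, m, xi):
    """Average strategic incentive E[lambda] = b_hat(0) * eta in the decision-theoretic case."""
    kappa_sq = xi ** 2 / m                      # signal variance under social communication
    b0 = 2.0 * PHI_INV(gamma) * xi / kappa_sq   # assumed benchmark response b_hat(0)
    eta = xi * PHI_INV(pi)                      # common component implied by pi
    return b0 * eta


# Vary the idiosyncrasy xi while holding gamma, pi and m fixed (illustrative values).
for xi in (0.5, 1.0, 2.0):
    print(xi, average_incentive(gamma=0.635, pi=0.54, m=25, xi=xi))
```

Each line prints the same incentive, equal to $2m\Phi^{-1}(\gamma)\Phi^{-1}(\pi)$: a more idiosyncratic electorate lowers the response to a signal but raises the typical size of that signal by the same factor.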


Turning attention back to b*, idiosyncrasy continues to have an effect: increased idiosyncrasy 
tends to increase incentives to vote strategically (Proposition 4, just below). There is an 
additional factor present. Increased idiosyncrasy means that other voters are (holding their 
strategic-voting incentives constant) less likely to vote strategically: there are fewer relatively 
indifferent voters. In a game-theoretic world, the caution inherent in optimal voting behavior 
(see Section 4.2) is reduced. Hence the incentive to vote strategically is greater. 


Proposition 4. Strategic voting incentives increase with the qualified majority $\gamma$, the
relative strength of the leading challenger $\pi$ and the signal precision $m$. In the uniquely stable
equilibrium strategic incentives increase with the idiosyncrasy of the electorate $\xi^2$. Also,


b(0)n eel -1 an b*n ola =i | (Oat) are? 
AE = 20M)" and im TE = 6-1) 224] aa 


The final claim of Proposition 4 reveals the rate at which incentives change when voters are 
faced with more information. The interesting feature is this: when voters are game-theoretic 
(that is, they anticipate strategic voting by others, rather than assuming that others vote
straightforwardly) they react much more slowly to increased information. In fact, since
strategic-voting incentives are increasing (at least asymptotically) with $\sqrt{m}$, this means that
the marginal effect of increased information declines to zero as $m \to \infty$: if (game-theoretic)
voters are quite well informed, then any additional information sources have little effect.


5.3. The Precision of Information and Duverger’s Law. Allowing $m \to \infty$ generates
a benchmark case where voters are almost perfectly informed. Almost everyone receives a
signal close to $\eta$, and thus the strategic incentives faced by voters increase without bound
as $m \to \infty$. Hence (for $\pi > 1/2$) almost all voters successfully coordinate on candidate 1.
Put simply, this means that the uniquely stable partially coordinated equilibrium is almost
perfectly Duvergerian when $m \to \infty$; a strict version of Duverger’s law (almost) holds when
voters have (almost) perfect knowledge of the electoral situation.


Notice that only one fully Duvergerian outcome remains as $m \to \infty$.[52] Interestingly, the case
of $m \to \infty$ essentially retains the assumption that voters know the electoral situation, but
removes the assumption that this is common knowledge. Thus the removal of the
common-knowledge element that underpins existing models eliminates all but one equilibrium.


Importantly, the partially coordinated outcome, with multi-candidate support, is not related
in any way to the non-Duvergerian equilibria of common-knowledge voting games: such
equilibria require the exact knowledge of underlying candidate-support levels. To see why,
notice that if the voting strategy generated $p \neq \frac{1}{2}$ then an infinite strategic-voting incentive
would appear as $n \to \infty$. To avoid this, and hence generate multi-candidate support, $p$ needs
to be close (arbitrarily close in a large electorate) to $\frac{1}{2}$. Thus a precisely calculated proportion
of the voters who prefer the leading challenger (so for $\pi > \frac{1}{2}$, voters who view candidate
1 as most preferred) must switch away from this leader and toward the trailing challenger.
Specifically, if a fraction $\pi - \frac{1}{2}$ of voters do so, then a tie between the challengers ($p = \frac{1}{2}$) will
result in the attenuation of any strategic voting incentives.[53] This feature leads to explicit
mis-coordination: some voters switch away from a leading challenger in order to stop the
strategic incentives that would lead to the successful defeat of the disliked status-quo. This
is especially perplexing when it is understood that such non-Duvergerian equilibria require
precise common knowledge of the electoral situation: $\pi$ or equivalently $\eta$ must be commonly


52The iterative-deletion of weakly dominated strategies employed by Dhillon and Lockwood (2004) does not 
eliminate equilibria in a common-knowledge version of the qualified-majority voting game. 

53 In fact, for the case $\pi > \frac{1}{2}$ a slightly greater proportion are required to switch in order to generate a (finite)
strategic incentive to switch away from candidate 1.
known by all voters.[54] These “knife-edge” properties were recognized by Palfrey (1989), who
discounted non-Duvergerian equilibria.[55][56]


5.4. The Impact of Strategic Voting. Changes in $\xi^2$ influence the strategic incentive
faced by voters, at least in the game-theoretic case. They also influence voter preferences,
however, and hence potentially determine a voter’s commitment to her most preferred
candidate. To assess the impact of strategic voting, I must consider both effects.


To measure this impact, I consider the probability that an individual in an appropriate
risk population votes strategically.[57] By “risk population” I mean those individuals for whom a
strategic vote makes sense. Suppose that $\pi > 1/2$, or equivalently $\eta > 0$, so that candidate
1 is the leading challenger. A voter who prefers candidate 2 (so that $\tilde{u}_i < 0$) but believes
that candidate 1 is the best-placed challenger (so that $\delta_i > 0$, which implies $\lambda = b\delta_i > 0$)
faces a positive strategic incentive and is “at risk” of voting strategically.


An easy way to study an at-risk individual is to equip her with a signal satisfying $\delta_i = \eta$
(so that she has an accurate expectation of the electoral situation; of course, she does not
know that this is the case) and then examine her behavior. She votes for candidate 1 whenever
$b\delta_i + \tilde{u}_i > 0$, or equivalently (given that $\delta_i = \eta$) when $(1 + b)\eta + \varepsilon_i > 0$. Hence the realized
support for candidate 1 is[58]

$$p = \Phi\left(\frac{(1+b)\eta}{\xi}\right) = \Phi\left((1+b)\Phi^{-1}(\pi)\right). \tag{5}$$

Of course, this voter actually prefers candidate 1 with probability $\pi$, and so a strategic vote
is observed with probability $p - \pi$. Furthermore, she is at risk of voting strategically when
she prefers candidate 2 (with probability $1 - \pi$). Hence, the impact of strategic voting,


54 The exact amount of “wrong-way switching” that is required is sensitive to $n$; non-Duvergerian equilibria
require common knowledge of the exact electorate size $n$.

55 It is perhaps unsurprising, therefore, that Fey’s (1997) repeated-election dynamic diverges away from
non-Duvergerian equilibria. Interestingly, his dynamic is rather different from the one described here. In
particular, it is path dependent: the limiting Duvergerian equilibrium to which it converges depends on the
starting point. Repeated elections are not necessary to employ the iterative best-response process of this
paper. This can be simply regarded as a “fictitious play” thought process in the mind of a player.

56 Glen (1994, p. 609) noted that non-Duvergerian equilibria “are not unusual in the mathematical sense of
being non-generic.” Here, the unique partially coordinated equilibrium involves strategic switching toward
the leading candidate. It cannot, therefore, exhibit a tie between the challengers. This continues to be
true in the limit as $\kappa^2 \to 0$ and all uncertainty is removed. It follows that non-Duvergerian equilibria are
non-generic: they only exist when $\kappa^2$ is exactly equal to zero.

57 Using a stripped-down variant of this model, Myatt and Fisher (2002) performed a similar exercise.

58 Here I have actually given her an artificial signal $\delta_i = \eta$, and so $\varepsilon_i \sim N(0, \xi^2)$. If I had considered
her behavior conditional on her actually receiving a signal $\delta_i$, then the distribution of $\varepsilon_i$ changes slightly.
The reason is that $\tilde{u}_i$ and hence $\varepsilon_i$ is a component of the signal $\delta_i$, and so knowledge of $\delta_i$ influences the
(conditional) distribution of $\varepsilon_i$. Correcting for this has a minimal effect on the analysis (Appendix B).
measured as a proportion of the risk population, is

$$\frac{p - \pi}{1 - \pi} = \frac{\Phi\left((1+b)\Phi^{-1}(\pi)\right) - \pi}{1 - \pi}. \tag{6}$$


Inspection of this expression, coupled with earlier results, yields the following proposition. 


Proposition 5. In the uniquely stable equilibrium, the probability that an at-risk individual 
votes strategically increases with the required qualified majority, the relative strength of the 
leading challenger, and the information available to voters; but decreases with idiosyncrasy. 
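
The impact measure can be evaluated directly once a response $b$ is given. The sketch below is a minimal illustration that takes Equation (5) to be $p = \Phi((1+b)\Phi^{-1}(\pi))$ and Equation (6) to be $(p - \pi)/(1 - \pi)$. The values of $b$ are hypothetical inputs chosen for illustration; they are not equilibrium values $b^*$, which would require solving the fixed-point equation of Section 4.

```python
from statistics import NormalDist

_N = NormalDist()  # standard normal distribution


def realized_support(b, pi):
    """Equation (5): p = Phi((1 + b) * Phi^{-1}(pi))."""
    return _N.cdf((1.0 + b) * _N.inv_cdf(pi))


def impact(b, pi):
    """Equation (6): share of the risk population that votes strategically."""
    return (realized_support(b, pi) - pi) / (1.0 - pi)


# Hypothetical responses b (not equilibrium values), for a leading challenger with pi = 0.54.
for b in (0.0, 0.5, 1.0, 2.0):
    print(b, round(impact(b, 0.54), 4))
```

With $b = 0$ (straightforward voting) the impact is zero, and it rises with $b$ for any $\pi > 1/2$. This is the channel through which the comparative statics of $b^*$ feed into Proposition 5.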


5.5. Distance-from-Contention and Marginality. A common intuition suggests that 
strategic voting should be greater in marginal elections, and when the preferred candidate is 
far from contention. The marginality hypothesis is problematic. The idea behind it is that 
pivotal events are more likely in more marginal elections.[59] This idea has no role in a theory
of purely instrumental voting, however, since it is the relative probability of different pivotal 
events that matters and not the absolute probabilities. 


Here, the electoral situation is summarized by $\gamma$ and $\pi$. These parameters are not an ideal way
to separate the effects of marginality and distance from contention. Suppose, for instance,
that $\gamma > \pi > 1/2$. An increase in $\pi$ (associated with higher strategic-voting incentives)
will increase the gap between the two challengers and hence the distance from contention of
candidate 2. Unfortunately this parameter change also closes the gap between candidate 1
and the required qualified-majority, making the election more marginal; it is unclear whether
increased strategic-voting is due to marginality or the distance from contention.


A different representation of the electoral situation is required. To do this, I expand
consideration to all members of the wider electorate, including those who vote (exogenously)
for the disliked candidate $j = 0$. I write the true underlying support for these three candidates as
$\psi_0$, $\psi_1$ and $\psi_2$ where $\psi_0 + \psi_1 + \psi_2 = 1$. This scenario may be easily mapped back into a
qualified-majority voting game: the qualified majority of anti-incumbent voters required to
defeat $j = 0$ is $\gamma = \psi_0/(1 - \psi_0)$; the true underlying support for candidate 1 among the
anti-incumbent electorate is $\pi = \psi_1/(1 - \psi_0)$.


Using this notation, the ideas of marginality and distance from contention may be formalized.
Consider an election in which at least some strategic voting is needed to defeat $j = 0$, so
that $\psi_0 > \psi_1 > \psi_2$, or equivalently $\gamma > \pi$. Then I may define

$$\text{Winning Margin} = w = \psi_0 - \psi_1 \quad\text{and}\quad \text{Distance from Contention} = d = \psi_1 - \psi_2.$$


59Cain (1978, p. 644) provided a classic example of this hypothesis in his analysis of strategic voting in 
Britain: he expected the pressure to defect to be lower in “noncompetitive” constituencies. 
These parameters (together with $\psi_0 + \psi_1 + \psi_2 = 1$) are sufficient to determine the electoral
situation. Solving linearly to obtain $\psi_0$, $\psi_1$ and $\psi_2$ in terms of $w$ and $d$,

$$\left.\begin{aligned} \psi_0 &= (1 + 2w + d)/3 \\ \psi_1 &= (1 - w + d)/3 \\ \psi_2 &= (1 - w - 2d)/3 \end{aligned}\right\} \;\Rightarrow\; \gamma = \frac{1 + 2w + d}{2 - 2w - d} \quad\text{and}\quad \pi = \frac{1 - w + d}{2 - 2w - d}.$$


Using this formulation, I may change $w$ and $d$ independently, and determine the effect on
$\gamma$ and $\pi$. By inspection, both $\gamma$ and $\pi$ are increasing in $d$, and so (as suggested by casual
intuition) an increase in the distance from contention results in greater strategic incentives.

The required qualified-majority $\gamma$ is clearly increasing in the winning margin $w$. Simple
algebra confirms that $\pi$ is also increasing in $w$. Taking these two responses to $w$, I reach
the following conclusion: fixing the distance from contention, an increase in the size of the
winning margin $w$ (making the election less marginal) actually increases the incentives to
vote strategically.


Proposition 6. When coordination is required to defeat $j = 0$ ($\psi_0 > \psi_1 > \psi_2$) the incentive
to vote strategically increases with both the winning margin and the distance from contention.
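
The mapping from the winning margin and distance from contention to the qualified-majority parameters can be sketched as follows. The code assumes the linear solution $\psi_0 = (1 + 2w + d)/3$, $\psi_1 = (1 - w + d)/3$, $\psi_2 = (1 - w - 2d)/3$ and the definitions $\gamma = \psi_0/(1 - \psi_0)$ and $\pi = \psi_1/(1 - \psi_0)$. The function name and the example values of $w$ and $d$ are illustrative.

```python
def electoral_situation(w, d):
    """Map winning margin w and distance from contention d to (psi0, psi1, psi2, gamma, pi)."""
    psi0 = (1 + 2 * w + d) / 3      # support for the disliked candidate
    psi1 = (1 - w + d) / 3          # support for the leading challenger
    psi2 = (1 - w - 2 * d) / 3      # support for the trailing challenger
    gamma = psi0 / (1 - psi0)       # qualified majority needed among challengers' supporters
    pi = psi1 / (1 - psi0)          # leading challenger's share among those supporters
    return psi0, psi1, psi2, gamma, pi


# Illustrative comparison: raise w or d separately from a common baseline.
for w, d in ((0.1, 0.1), (0.2, 0.1), (0.1, 0.2)):
    psi0, psi1, psi2, gamma, pi = electoral_situation(w, d)
    print(w, d, round(gamma, 4), round(pi, 4))
```

Raising either $w$ or $d$ from the baseline raises both $\gamma$ and $\pi$, which is the pattern behind Proposition 6.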


This prediction runs against established intuition. After controlling for the distance from
contention, strategic voting should be lower in more marginal elections.[61]


5.6. The Importance of Strategic Voting and Election Inversion. Equation (6) measures
the impact of strategic voting. In a similar way, I may assess the importance of strategic
voting. An inversion of Equation (5) would yield $\pi$ as a function of $p$: $\pi = \Phi(\Phi^{-1}(p)/(1+b))$.
This reveals what the election outcome would have been in the absence of strategic voting. If
$p > \gamma > \pi$, then strategic voting changes the identity of the winning candidate. For a more
accurate assessment of this effect, I must also account for the variation in voters’ signals.


60 Empirically the distance from contention is well known as a strong predictor of strategic voting, allowing
the measure to become a basis to construct validity tests for different measures of strategic voting (Franklin, 
Niemi and Whitten 1992, 1993, 1994 and Evans and Heath 1993, 1994). 

61 To adjudicate between the common intuition of Cain (1978) and others, and the comparative static results
presented here, empirical analysis is required. In recent work, Fisher (2000) found that strategic voting
increases with the winning margin in UK General Elections. The effect is weak, but nevertheless is significant
at the 10% level when estimated across the three elections of 1987, 1992 and 1997. This analysis was extended
by Myatt and Fisher (2003) and Fisher (2001b). In the former paper, a version of the strategic incentive $\lambda$
was shown to explain the pattern of strategic voting across all three elections, and its inclusion eliminated the
significance of the winning margin and distance from contention as separate measures. In the latter paper,
Fisher (2001b) included a range of other explanatory variables, including political interest, education, party
identification, and local campaign-spending. A version of $\lambda$ continued to have strong explanatory power.
This means that there is strong empirical support for the present theory relative to informal intuition. 
Interestingly, this provides a partial response to the critique of rational-choice theory offered by Green and 
Shapiro (1994): the formal theory offers explanations that differ from intuition, and yet better fit the data. 
Straightforward manipulations lead to


<2 (1 + b)n _ “lie m 
pao (on) -0(a+ne OfearatR) 
m r=0 (TO (ovate), 7) 


1+5b m 


Equation (7) allows me to “invert” an election outcome (that is, I remove the effect of
strategic voting) and requires as input the parameters $\gamma$, $p$ and $m$. The parameters $\gamma$ and
$p$ summarize the realized electoral situation.[62] In the calibration exercise of Section 5.7, I
examine the response of strategic voting to the information parameter $m$. Before doing this,
however, I must consider the specification of $\xi^2$.


$\xi^2$ measures the idiosyncrasy of preferences across the electorate. An increase in $\xi^2$ results
in a larger number of voters who are more heavily committed to their preferred candidate
(for instance, when $u_{i1}/u_{i2}$ is large). As I show in Appendix B, a value for $\xi^2$ may be pinned
down by the preferences of the median supporter of a particular challenging candidate.


To understand this idea, consider the median supporter of candidate 1, and suppose that
she prefers candidate 1 twice as much as candidate 2, so that $u_{i1} = 2u_{i2}$. Since she is the
median supporter, it follows that $\Pr[u_{i1} > 2u_{i2}] = \pi/2$. Together with $\pi$, this equation is
sufficient to solve for $\xi^2$. In fact, for this example and $\pi = 1/2$, it solves for $\xi^2 = 1.056$ (see the
Appendix). I thus fix $\xi^2$ at this value for the calibration exercises of Sections 5.7 and 6.3.
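
The value $\xi^2 = 1.056$ can be reproduced with a short calculation. The sketch below is one way to do so, and it rests on assumptions that should be checked against Appendix B: the log relative preference $\tilde{u}_i = \log(u_{i1}/u_{i2})$ is distributed $N(\eta, \xi^2)$ with $\eta = \xi\Phi^{-1}(\pi)$, so that $\Pr[\tilde{u}_i > \log 2] = \pi/2$ rearranges to $\xi = \log 2/[\Phi^{-1}(1 - \pi/2) + \Phi^{-1}(\pi)]$. The function name and the `ratio` argument are illustrative.

```python
from math import log
from statistics import NormalDist


def idiosyncrasy(pi, ratio=2.0):
    """Solve Pr[u_i1 > ratio * u_i2] = pi / 2 for xi^2, given pi = Phi(eta / xi)."""
    inv = NormalDist().inv_cdf
    # Pr[u~ > log(ratio)] = pi/2  <=>  (log(ratio) - eta) / xi = Phi^{-1}(1 - pi/2)
    xi = log(ratio) / (inv(1.0 - pi / 2.0) + inv(pi))
    return xi ** 2


print(round(idiosyncrasy(0.5), 3))  # 1.056
```

For $\pi = 1/2$ the common component is $\eta = 0$, so the calculation reduces to $\xi = \log 2/\Phi^{-1}(0.75)$.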


5.7. Application: New York (1970). I am now in a position to calibrate the model 
and examine the impact and importance of strategic voting in the context of reasonable 
parameter values. I return to the case of the 1970 New York senatorial election (Table 1). 
Mapping this example into the model yields the parameters $\gamma$ and $p$,

$$\gamma = \frac{\psi_0}{\psi_1 + \psi_2} = \frac{2{,}288{,}190}{3{,}605{,}704} = 0.635 \quad\text{and}\quad p = \frac{\psi_1}{\psi_1 + \psi_2} = \frac{2{,}171{,}232}{3{,}605{,}704} = 0.602.$$

Hence liberal voters needed to achieve a qualified majority of $\gamma = 0.635$ to defeat the disliked
Buckley and yet achieved only $p = 0.602$; this split in the liberal vote allowed Buckley to
win. Following the analysis above, I fix the idiosyncrasy parameter at $\xi^2 = 1.056$. For any
specified $m$, I may now calculate the equilibrium response $b$ of a voter to her signal $\delta_i$. For
a variety of values for $\pi$, I calculate the impact of strategic voting, by substituting into
Equation (6). The results are displayed in Figure 2(a). As expected, the impact of strategic
voting on the risk population increases with both $m$ and $\pi$. For small $m$ and $\pi$, strategic
voting is limited. However, it takes only moderate values of $m$ to dramatically increase the

62 This contrasts with the true underlying electoral situation, which relies on $\pi$ rather than $p$.
FIGURE 2. Strategic Voting in the New York Senatorial Election


equilibrium impact of strategic voting to (when $\pi$ is relatively large) very high levels. Are
such large values for $m$ reasonable?
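
The mapping of the New York vote totals into the model parameters is a matter of simple arithmetic, sketched here for concreteness. The vote counts are those quoted in the text; the variable names are illustrative, and the trailing challenger's total is inferred by subtraction rather than quoted.

```python
# Vote totals quoted in the text for the 1970 New York senatorial election.
buckley = 2_288_190          # the disliked candidate
ottinger = 2_171_232         # the leading challenger (candidate 1)
challengers = 3_605_704      # combined vote for the two challengers

other_challenger = challengers - ottinger   # trailing challenger, inferred by subtraction

gamma = buckley / challengers    # qualified majority needed among challenger voters
p = ottinger / challengers       # realized share of candidate 1 among challenger voters

print(round(gamma, 3), round(p, 3))   # 0.635 0.602
print(p < gamma)                      # True: coordination fell short
```

The shortfall $\gamma - p$ of a little over three percentage points is the amount of additional coordination that would have been needed to defeat Buckley.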


Adopting the social-communication micro-foundation for the informative signal $\delta_i$ described
in Section 2.4, it would seem that moderately large values for $m$ are appropriate: an
individual voter might be expected to interact with a large number of people during day-to-day
life. This may be misleading, however. Even if the number of individuals observed is large,
the effective value for $m$ will be rather lower. I offer two reasons here. First, the observation
of the preferences of others is likely to occur with noise. Second, and most importantly, if
a voter interacts with individuals whose preferences are correlated, then the observation of
additional voters is likely to add little extra information; I address this second issue formally
in Section 6. To cope with these issues here, I may check the plausibility of $m$ by computing
the accuracy of beliefs.


As in Section 2.5 I define the accuracy of beliefs as the probability that the informative signal
correctly identifies the leading challenger. I continue to assume (without loss of generality)
that $\eta > 0$, so that $\pi > \frac{1}{2}$. A signal indicates the correct leader when $\delta_i > 0$. Recall that this
occurs with probability $\alpha = \Pr[\delta_i > 0] = \Phi((\sqrt{m}/\xi)\eta) = \Phi(\sqrt{m}\,\Phi^{-1}(\pi))$. This last formula
allows me to cross-check $m$ against the accuracy of beliefs. For instance,

$$m = 25 \;\Rightarrow\; \alpha = \Phi\left(\sqrt{m}\,\Phi^{-1}(\pi)\right) = \begin{cases} 0.599 & \text{if } \pi = 0.52, \\ 0.692 & \text{if } \pi = 0.54, \text{ and} \\ 0.775 & \text{if } \pi = 0.56, \end{cases}$$
with further values illustrated in Figure 2(b). This calibration exercise demonstrates that if
the outcome of the New York senatorial election is to be consistent with the present model
then either the true value for $\pi$ must be relatively close to $\frac{1}{2}$, or the information available to
voters must not identify the leading challenger with very high probability.
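
The accuracy figures can be reproduced from the formula $\alpha = \Phi(\sqrt{m}\,\Phi^{-1}(\pi))$. The sketch below evaluates it at $m = 25$, the value used in the inversion exercise that follows; the function name is an illustrative choice.

```python
from statistics import NormalDist

_N = NormalDist()  # standard normal distribution


def accuracy(m, pi):
    """Probability that a signal identifies the leading challenger: Phi(sqrt(m) * Phi^{-1}(pi))."""
    return _N.cdf(m ** 0.5 * _N.inv_cdf(pi))


for pi in (0.52, 0.54, 0.56):
    print(pi, round(accuracy(25, pi), 3))   # 0.599, 0.692 and 0.775 respectively
```

Accuracy rises quickly with both $m$ and $\pi$, so a modest sample size already implies fairly sharp beliefs unless the two challengers are nearly tied.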


In fact, I may calculate the parameter $\pi$ that is consistent with the data by using
Equation (7). Doing so with $m = 25$ yields $\pi = 0.528$. This suggests that, absent strategic
voting, Ottinger’s true support may have been closer to $0.528 \times 61\% = 32.2\%$. Of course,
the value of $m = 25$ implies that voters’ beliefs were not particularly accurate, and hence
this “inversion” procedure is subject to the critique that an instrumental theory requires
voters to have relatively “fuzzy” beliefs. I turn attention to this issue.


6. COMMUNITY SAMPLING AND FUZZY BELIEFS 


6.1. Social Communication within Communities. The calibration exercise for the New 
York election suggests that the present theory is consistent with observed data only when 
the information available to voters is limited. If I were to introduce the possibility of opinion 
polls, then the probability that a voter will be able to identify the lead challenger will be 
relatively large. In such situations, therefore, the theory may well predict too much strategic
voting, leading me to question the assumption of instrumental motivations.


Turning to other scenarios, however, this is not necessarily the case. The plurality rule is 
used in the parliamentary constituencies of the United Kingdom. At a constituency level, 
opinion polls are not typically available (see Footnote 31), and hence a voter must learn of the 
electoral situation via social communication. If I am to take seriously the sampling procedure
described in Section 2.4, then I must also recognize that voters are likely to communicate 
with other individuals who are similar to them. 


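I formalize this idea with a notion of community communication. Suppose that each individual
belongs to a community, and that a fraction $\omega$ of her idiosyncratic component is shared
with all individuals in the community. This is modelled via the following decomposition:

\[ \varepsilon_i = \theta + \hat{\varepsilon}_i \;\Rightarrow\; \tilde{u}_i = \eta + \theta + \hat{\varepsilon}_i, \quad \text{where} \quad \operatorname{var}[\theta \mid \eta] = \omega\xi^2 \quad \text{and} \quad \operatorname{var}[\hat{\varepsilon}_i \mid \eta] = (1-\omega)\xi^2. \]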


Thus preferences are equal to a common component across the electorate, plus a community
component and then a further individual component; a multi-level model.⁶³ The parameter
$\omega$ represents the relative importance of the community effect. This specification may seem
reasonable; after all, the preferences of the electorate may well be affected by events at
a community level. Now, suppose that a voter is only able to sample the preferences of
individuals from her own community. Since $\theta$ is common to every sample member, it cannot
be averaged out. This means that $\operatorname{var}[\delta_i \mid \eta] \geq \operatorname{var}[\theta \mid \eta]$. In fact, the best that a voter can
do is to identify perfectly the typical member of her community. This would be equivalent
to the observation of $\eta + \theta$, and hence a signal $\delta_i = \eta + \theta$. In this case,

\[ \operatorname{var}[\delta_i \mid \eta] = \operatorname{var}[\theta \mid \eta] = \omega\xi^2 \;\Rightarrow\; m = \frac{\xi^2}{\operatorname{var}[\delta_i \mid \eta]} = \frac{1}{\omega}. \]

It follows that there is an upper bound to the precision of information. For instance, if
$\omega = 0.2$ so that 20% of individual variation is due to variation across communities, then
$m \leq 5$. This suggests that appropriate values for $m$ might well be very low indeed. In
Figure 3(a) I plot the impact of strategic voting against $\omega$. By inspection, even a small
element of community variation dramatically reduces the impact of strategic voting.

⁶³ A related specification for a model of district-based electoral systems was proposed by Gelman and King
(1994); such a hierarchical probit structure was also used by Gelman, Katz, and Tuerlinckx (2002, pp. 431-33)
in their study of power indices.

FIGURE 3. The Effect of Community Variation
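The upper bound on the precision of information can be tabulated directly. The sketch below assumes only the relation between the community share (written `omega` here) and the precision $m$ described above, namely that $m$ can be at most the reciprocal of the community share; the function name `max_precision` is hypothetical and chosen for illustration.

```python
def max_precision(omega: float) -> float:
    """Upper bound on m = xi^2 / var[delta_i | eta], attained when the signal
    is exactly eta + theta (hypothetical helper, not taken from the paper)."""
    if not 0.0 < omega <= 1.0:
        raise ValueError("omega must lie in (0, 1]")
    return 1.0 / omega


if __name__ == "__main__":
    # Community shares ranging from a small to a large community effect.
    for omega in (0.05, 0.1, 0.2, 0.3, 0.4, 0.5):
        print(f"omega = {omega:.2f}: m <= {max_precision(omega):.2f}")
```

Even a community share of one fifth caps the precision at five, which is far below the value of 25 used for the New York calibration.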


6.2. The False-Consensus Effect. The presence of community variation has additional 
implications for the pattern of correlation between a voter’s preferences and her beliefs. 
Suppose that candidate 1 is the true leading challenger ($\eta > 0$) but that a particular voter
$i$ prefers candidate 2, so that $\tilde{u}_i < 0$. When $\omega$ is large, so that the individual variation is
small relative to community variation, then it is highly likely that voter $i$ is drawn from a
community where $\eta + \theta < 0$. If her opinions are based on the observation of her community
(so that $\delta_i = \eta + \theta$ as above) then she will believe that her favored candidate 2 is the leading
challenger. In other words, for large $\omega$, voters are likely to live in communities where their
opinions are shared, and hence any information on the electoral situation will be highly
correlated with their own preferences. To express this formally, I may examine


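\[ \Pr[\delta_i < 0 \mid \tilde{u}_i < 0] = \frac{1}{1-\pi} \int_{\Phi^{-1}(\pi)/\sqrt{\omega}}^{\infty} \Phi\!\left( \frac{z\sqrt{\omega} - \Phi^{-1}(\pi)}{\sqrt{1-\omega}} \right) d\Phi(z). \tag{8} \]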


This is the probability that a voter’s signal indicates that candidate 2 is the lead challenger
given that she prefers candidate 2. The equality is derived in Appendix B.


I plot this probability against $\omega$ in Figure 3(b). Since I have, without loss of generality,
specified $\pi > 1/2 \Leftrightarrow \eta > 0$, this is equivalent to the probability that a supporter of the trailing
challenger is mistaken in her belief of this candidate’s likely prominence. By inspection, for
large $\omega$ this probability is quite high, and importantly it exceeds $1/2$.
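The conditional probability in Equation (8) can be evaluated numerically. The sketch below is an illustration rather than the procedure behind Figure 3(b): it assumes the community-signal setup of Section 6.1 (the signal equals the common component plus the community component), writes the integral with lower limit $\Phi^{-1}(\pi)/\sqrt{\omega}$, and uses a simple midpoint rule. The function name and the truncation point of the integral are assumptions made for this example.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: provides cdf, pdf and inv_cdf


def prob_signal_agrees(pi: float, omega: float, steps: int = 20_000) -> float:
    """Pr[delta_i < 0 | u_i < 0] as in Equation (8), via the midpoint rule.

    pi is the true share preferring candidate 1 and omega the community share
    of idiosyncratic variance; both must lie strictly between 0 and 1.
    """
    if not (0.0 < pi < 1.0 and 0.0 < omega < 1.0):
        raise ValueError("pi and omega must lie strictly between 0 and 1")
    c = STD_NORMAL.inv_cdf(pi)       # Phi^{-1}(pi), i.e. eta / xi
    lower = c / sqrt(omega)          # lower limit of the integral
    upper = max(lower, 0.0) + 10.0   # truncation: the normal tail beyond is negligible
    width = (upper - lower) / steps
    total = 0.0
    for k in range(steps):
        z = lower + (k + 0.5) * width
        inner = (z * sqrt(omega) - c) / sqrt(1.0 - omega)
        total += STD_NORMAL.cdf(inner) * STD_NORMAL.pdf(z)
    return total * width / (1.0 - pi)


if __name__ == "__main__":
    for omega in (0.1, 0.3, 0.5, 0.7, 0.9):
        print(f"omega = {omega:.1f}: {prob_signal_agrees(0.6, omega):.3f}")
```

In the balanced case, where half the electorate prefers each challenger and the community share is one half, the integral can be done by hand and equals 3/4, which gives a convenient check on the numerical routine.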


I make two observations. First, this analysis shows that an exaggerated belief (referred to as 
the “false-consensus effect” in the psychology literature) in the likely success of a preferred 
candidate is in no way irrational.⁶⁴ In fact, voters form their opinions in a way that is entirely
rational; any “exaggerated belief” is merely a consequence of the fact that preferences and 
information are highly correlated. Second, it suggests that if community variation is large 
then one might expect to see a large incidence of apparently (but not actually) misplaced 
confidence in preferred candidates. 


6.3. Application: United Kingdom (1997). The fully Duvergerian equilibria of a “common
knowledge” plurality-voting game predict strict bipartism at the level of the election
district; all voters coordinate behind two leading candidates. The 1970 New York election
provides a counter-example to both strict bipartism and the non-Duvergerian equilibria
emphasized by Cox (1994). The data’s rejection of such common-knowledge equilibria is more
widespread; I return to the UK General Election of 1997 for a further illustration. In 1997,
the three major political parties (Conservative, Labour and Liberal Democrat) competed
throughout 527 English parliamentary constituencies. In Figure 4, I use barycentric
coordinates to plot their relative vote shares: both strictly Duvergerian and non-Duvergerian
equilibria are absent.


Using the idea of community communication, I now consider calibrations of the model with
rather less information, corresponding to lower values of $m$. Naturally, lower values of $m$
lead to lower accuracy, and hence voters find it harder to identify the true leading challenger
to the disliked incumbent. To take an example, suppose that $\omega = 1/5$, or equivalently $m = 5$.


⁶⁴ Goeree and Grosser (2003) attributed the “false consensus” terminology to Ross, Greene, and House
(1977), and credited Dawes (1990) with the observation that it is not justified automatically. They noted that
“[w]hen someone’s decision is driven by their taste, which is seen as a draw from a population, then it is
perfectly rational to use this decision the same way as any other random sample of size one. Only when
information about one’s own taste is overweighed [sic] is the perceived consensus false.” Notice that the
false-consensus effect differs from the confirmatory-bias modelled by Rabin and Schrag (1999); they considered
over-confidence in hypotheses rather than a correlation between an individual’s preferences and her beliefs.
[Figure 4: simplex plot with corners labelled Conservative, Labour, and Liberal Democrat, and regions labelled Certain Tory Win, Coordination Required, and Certain Tory Loss.]

The relative vote shares for the three major parties for 527 English constituencies
are plotted using a barycentric or simplex plot. The three corners of the simplex
represent 100% vote shares for the labelled party. A bullet point “•” indicates a
constituency. Its location on the graph is a weighted average of the three extreme
points, with weights corresponding to the relative vote shares of the major parties.
The sides of the simplex (solid lines) represent points where only two parties
receive votes. The hatched lines represent points where two parties tie for the lead,
and hence separate the “win zones” for each party. The dotted lines delineate the
constituencies in which the Conservative party polled between 1/3 and 1/2 of the
votes for major parties. These are the 270 constituencies in which coordination of
anti-Conservative voters is required to avoid a Tory win.


FIGURE 4. British General Election 1997 


If $\pi = 0.635$, then a voter correctly identifies the leading challenger with probability 78%,
and hence 22% of the electorate are mistaken in their perception of the leading challenger.
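The 78% figure can be reproduced under one assumption that the passage does not spell out: that a voter's signal is normally distributed around the common component with variance equal to the preference variance divided by $m$, so that the signal has the correct sign with probability $\Phi(\sqrt{m}\,\Phi^{-1}(\pi))$. The short sketch below evaluates that expression; the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def prob_correct_identification(pi: float, m: float) -> float:
    """Probability that a signal centred on eta with variance xi^2 / m is
    positive, when pi = Phi(eta / xi). Assumed form, used for illustration."""
    std = NormalDist()
    return std.cdf(sqrt(m) * std.inv_cdf(pi))


if __name__ == "__main__":
    share = prob_correct_identification(0.635, 5.0)
    print(f"correctly identified: {share:.1%}, mistaken: {1.0 - share:.1%}")
```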


Does this resonate with empirical observation? To assess this, I may turn to the 1997 British 
Election Survey. Fisher (2000, p. 6) commented that 68.5% of those voters who expected 
their preferred party to come second actually found that it came third.⁶⁵ This suggests that
British voters may have rather inaccurate opinions of their constituency’s characteristics, 
and hence the electoral situation. More interestingly, he found that roughly half of those 


⁶⁵ This was an empirical instance of the false-consensus effect. Brown (1982) and Baker, Koestner, Worren,
Losier, and Vallerand (1995) reported similar findings in US and Canadian elections. 
                  | Actual | $\omega = 0.05$ | $\omega = 0.1$ | $\omega = 0.2$ | $\omega = 0.3$ | $\omega = 0.4$ | $\omega = 0.5$
Lab > Con > Lib   |    118 |              57 |             70 |             82 |            105 |            105 |            105
Con > Lab > Lib   |     73 |             134 |            121 |            109 |             86 |             86 |             86
Con > Lib > Lab   |     54 |              70 |             68 |             65 |             65 |             63 |             59
Lib > Con > Lab   |     25 |               9 |             11 |             14 |             14 |             16 |             20
Con → Lab         |      - |              61 |             48 |             36 |             13 |             13 |             13
Con → Lib         |      - |              16 |             14 |             11 |             11 |              9 |              5
Con Seats Lost    |      - |              77 |             62 |             47 |             24 |             22 |             18
Impact            |      - |           35.8% |          31.9% |          26.3% |          21.9% |          18.1% |          14.6%


The first four rows of the table classify the 270 seats according to the relative
vote shares of the three main parties. I have excluded constituencies in which the
Conservative party came third, hence there are four possible configurations. The
first column gives the classification according to the actual results. The remaining
columns give the results when the strategic voting element is removed, for values
of $\omega$ from 0.05 to 0.5 (and hence $m$ from 20 to 2). The fifth, sixth, and seventh
rows give the number of seats lost by the Conservatives due to strategic switching.
The final row gives the probability of a strategic vote by a typical member of the
risk-population; that is, someone who prefers the trailing challenger and is equipped
with a correct signal realization indicating the electoral situation (Equation (6)).


TABLE 2. Inverted Election Results: English Constituencies 1997 
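The seat-loss rows of Table 2 appear to follow from the classification rows by simple bookkeeping: a seat counts as lost when the Conservatives would have come first in the notional (no strategic voting) result but did not in the actual result. The sketch below transcribes the table and checks that reading; the dictionary layout and function name are choices made for this example.

```python
# Seat counts transcribed from Table 2 (270 constituencies in every column).
ACTUAL = {"Lab>Con>Lib": 118, "Con>Lab>Lib": 73, "Con>Lib>Lab": 54, "Lib>Con>Lab": 25}

# Notional results with strategic voting removed, keyed by the community share.
NOTIONAL = {
    0.05: {"Lab>Con>Lib": 57, "Con>Lab>Lib": 134, "Con>Lib>Lab": 70, "Lib>Con>Lab": 9},
    0.1: {"Lab>Con>Lib": 70, "Con>Lab>Lib": 121, "Con>Lib>Lab": 68, "Lib>Con>Lab": 11},
    0.2: {"Lab>Con>Lib": 82, "Con>Lab>Lib": 109, "Con>Lib>Lab": 65, "Lib>Con>Lab": 14},
    0.3: {"Lab>Con>Lib": 105, "Con>Lab>Lib": 86, "Con>Lib>Lab": 65, "Lib>Con>Lab": 14},
    0.4: {"Lab>Con>Lib": 105, "Con>Lab>Lib": 86, "Con>Lib>Lab": 63, "Lib>Con>Lab": 16},
    0.5: {"Lab>Con>Lib": 105, "Con>Lab>Lib": 86, "Con>Lib>Lab": 59, "Lib>Con>Lab": 20},
}


def seats_lost(notional: dict) -> tuple:
    """Return (lost to Labour, lost to Liberal Democrats, total lost)."""
    to_lab = notional["Con>Lab>Lib"] - ACTUAL["Con>Lab>Lib"]
    to_lib = notional["Con>Lib>Lab"] - ACTUAL["Con>Lib>Lab"]
    return to_lab, to_lib, to_lab + to_lib


if __name__ == "__main__":
    for omega, column in NOTIONAL.items():
        print(omega, sum(column.values()), seats_lost(column))
```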


whose preferred party came third expected it to come second. This coincides rather closely 
with the analysis of Section 6.2. It would appear, therefore, that a situation in which voters’ 
beliefs are somewhat “fuzzy,” due to significant community-specific effects in preferences, 
generates (at first blush) the kind of results observed in survey data. 


Employing these ideas, I turn to the data displayed in Figure 4. As a calibration exercise, I 
consider the 270 constituencies where the Conservatives polled between 1/3 and 1/2 of the 
vote, and identify them as unpopular incumbents. 


For each constituency, I calculate appropriate values for $\gamma$ and $p$. For a variety of values of
$\omega$ (and hence $m$) I calculate $b$ and, using Equation (7), “invert” the election result to obtain
the notional true level of support for each challenging candidate. Using these notional levels
of support, I then calculate the election results in the absence of strategic voting.


Of course, this procedure is very much a “back-of-the-envelope” calibration exercise, and 
should be seen as nothing more than indicative. Nevertheless, the results are interesting,


⁶⁶ The implicit assumption is that all supporters of the Labour and Liberal Democrat parties rank the
Conservative party last. Of course, this is not actually the case, and in fact a significant proportion of 
Liberal Democrat voters ranked the Labour party last in 1997. Nevertheless, this simplification is used to 
generate a “ball-park” figure for the amount of strategic voting that is consistent with observed patterns. 
and given in Table 2. For $\omega = 0.2$, so that community variation accounts for 20% of the
idiosyncratic variation of voter preferences, the results suggest that the Conservative party
lost 47 seats due to strategic voting towards the Labour and Liberal Democrat parties.
Furthermore, this same parameter choice suggests that, according to the theory, strategic
voting affected around 26% of the “risk population” of trailing candidate supporters. This
matches well with the empirical estimates of Fisher (2000). He estimates (again using BES
data) that 24.4% of voters in the risk population of third-party supporters voted strategically
in the 1997 British General Election.⁶⁷ The overall incidence of strategic voting from a
calibrated model appears to be about right.


7. CONCLUDING REMARKS 


I began with the observation that many analyses of plurality-rule elections predict the complete
coordination of strategic voting, and hence support for only two candidates: a strictly
bipartite outcome. My suggestion that stable multi-candidate support will arise as part of a
fulfilled-expectations equilibrium follows from the weakening of a single assumption: I have
removed the common knowledge of the electoral situation. In doing so, I have attempted to
follow a suggestion of McKelvey and Ordeshook (1985, p. 57) by investigating “the extent
to which the equilibria of systems with limited information corresponds to the equilibria of
systems with full information.” This investigation yields a uniquely stable partially coordinated
voting equilibrium in which there is some, but not complete, strategic vote-switching
toward a leading challenger. In the limit, as voters’ information becomes precise, this equilibrium
converges to one of the strictly Duvergerian equilibria described by Palfrey (1989)
and Myerson and Weber (1993); nevertheless, the community-communication story suggests
that, in some settings, there will be a limit to the precision of information available to voters,
and hence a bound to Duverger’s psychological effect.


Formally, the equilibrium described here is radically different from the non-Duvergerian
equilibria that were important to Cox (1994). Despite this, the uniquely stable partially
coordinated equilibrium incorporates precisely the informal (and insightful) intuition that
he offered. For instance, Cox (1997, p. 86) interpreted the result for the Ross and Cromarty
constituency (where the Liberal and Labour candidates almost tied, permitting a Conservative
win) from the 1970 UK General Election as a potential non-Duvergerian equilibrium. He
claimed that this interpretation required that “it was not clear who was in third and who in
second” and that “neither [challenger] suffered from strategic desertion.” These are exactly
opposite to the requirements of non-Duvergerian equilibria, which require exact knowledge of
the gap between parties and a precisely calculated strategic swing. This intuition, however,
is entirely consistent with the partially coordinated equilibrium described here: voters are
uncertain of the ranking of challenging candidates.

⁶⁷ This impact figure was calculated across all English constituencies and not just those considered here.


I conclude by noting that this paper offers insights and predictions that differ somewhat from 
casual intuition; for instance, I predict the negative-feedback effect, and the rejection of the 
marginality hypothesis described by Cain (1978) and others. While I have made no attempt 
to conduct an empirical study, some of the predictions do seem consistent with the outcomes 
of UK elections; moreover, related work takes some of the predictions to the data, and they 
are not rejected. This is in contrast to the claim of Green and Shapiro (1994, p. 6) that 
rational-choice theory does “little more than restate existing knowledge in rational-choice 
terminology.” It may be argued that results offered here yield a response to this critique. 


APPENDIX A. OMITTED PROOFS 


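Proof of Lemma 1. By Assumption 2, and conditional on $\eta$, the sufficient-statistic signal $\delta_i$
and the log-relative preference $\tilde{u}_i$ are joint normal. Consider the construction of an alternative
signal $\hat{\delta}_i = a\delta_i + (1-a)\tilde{u}_i$ for some $a \in \mathbb{R}$. This satisfies $E[\hat{\delta}_i \mid \eta] = \eta$, and has variance
$\operatorname{var}[\hat{\delta}_i \mid \eta] = a^2\kappa^2 + (1-a)^2\xi^2 + 2a(1-a)\operatorname{cov}[\delta_i, \tilde{u}_i \mid \eta]$. Choosing $a$ to minimize this,

\[ \frac{\partial \operatorname{var}[\hat{\delta}_i \mid \eta]}{\partial a} = 2a\kappa^2 - 2(1-a)\xi^2 + 2(1-2a)\operatorname{cov}[\delta_i, \tilde{u}_i \mid \eta] = 0. \]

If $\delta_i$ is a sufficient statistic, then this must hold at $a = 1$. This yields $\operatorname{cov}[\delta_i, \tilde{u}_i \mid \eta] = \kappa^2$.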
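As a check on Lemma 1, the covariance it delivers can be substituted back into the variance of the alternative signal. The short derivation below writes $\hat{\delta}_i$ for the alternative signal $a\delta_i + (1-a)\tilde{u}_i$ (a label chosen here), $\kappa^2$ for the signal variance and $\xi^2$ for the preference variance, and additionally assumes $\xi^2 \geq \kappa^2$, as is used in the proof of Lemma 5. Under those assumptions no reweighting improves on $a = 1$.

```latex
\begin{align*}
\operatorname{var}[\hat{\delta}_i \mid \eta]
  &= a^2\kappa^2 + (1-a)^2\xi^2 + 2a(1-a)\kappa^2 \\
  &= \kappa^2\bigl[1 - (1-a)^2\bigr] + (1-a)^2\xi^2 \\
  &= \kappa^2 + (1-a)^2\bigl(\xi^2 - \kappa^2\bigr) \;\geq\; \kappa^2 ,
\end{align*}
```

The lower bound $\kappa^2$ is attained at $a = 1$, which is the sense in which the original signal already exhausts the information in the voter's own preference.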


Proof of Lemma 2. Follows from the argument presented in the main text. 


Proof of Lemma 3. I need to show that $p$ takes on all values in the $(0,1)$ interval with positive
density. First, $v(\delta_i, \tilde{u}_i) = 1$ for some pair $\delta_i$ and $\tilde{u}_i$, by assumption. By monotonicity, it
retains this value for larger $\delta_i$ and $\tilde{u}_i$. It follows that $E[v(\delta_i, \tilde{u}_i) \mid \eta] \to 1$ as $\eta \to \infty$. Similarly,
$E[v(\delta_i, \tilde{u}_i) \mid \eta] \to 0$ as $\eta \to -\infty$. The posterior belief over $\eta$ yields a positive density on the
real line, and hence $f(p \mid \delta)$ has full support. Continuity follows straightforwardly from the
properties of the expectation with respect to a continuous density.


Sketch Proof of Proposition 1. Examine Equation (1) and notice that as $n$ grows (and $\gamma_n \to \gamma$)
the integrand becomes increasingly peaked around a maximum at $p = \gamma$. Equivalently, in
a large electorate, candidate 1 can only match the required qualified majority when $p = \gamma$. It
follows that only density local to $\gamma$ is relevant in the integral, and hence I may replace $f(p \mid \delta)$
with $f(\gamma \mid \delta)$. Bringing this outside the integral, the remaining expression (a Beta density)
integrates to $1/(n+1)$, yielding Proposition 1. For a rigorous proof of this proposition, see
the not-for-publication Appendix C.
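Proof of Lemma 4. Suppose that all voters use a monotonic voting strategy exhibiting strict
multi-candidate support. Define $p = E[v(\delta_i, \tilde{u}_i) \mid \eta] = H(\eta)$. This is strictly and smoothly
increasing in $\eta$. Write $h(\eta) = H'(\eta)$. Following the receipt of a signal $\delta$,

\[ F(p \mid \delta) = \Pr[\eta \leq H^{-1}(p) \mid \delta] = \Phi\!\left( \frac{H^{-1}(p) - E[\eta \mid \delta]}{\sqrt{\operatorname{var}[\eta \mid \delta]}} \right), \]

where $\Phi(\cdot)$ is the cumulative distribution function of the standard normal. This uses the fact that
private posterior beliefs over $\eta$ are normal. Differentiating,

\[ f(p \mid \delta) = \frac{1}{h(H^{-1}(p))\sqrt{\operatorname{var}[\eta \mid \delta]}} \, \phi\!\left( \frac{H^{-1}(p) - E[\eta \mid \delta]}{\sqrt{\operatorname{var}[\eta \mid \delta]}} \right). \]

Turning to the strategic incentive $\lambda$,

\[ \begin{aligned}
\lambda = \log\frac{f(\gamma \mid \delta)}{f(1-\gamma \mid \delta)}
 &= \log\frac{h(H^{-1}(1-\gamma))}{h(H^{-1}(\gamma))} - \frac{(H^{-1}(\gamma) - E[\eta \mid \delta])^2}{2\operatorname{var}[\eta \mid \delta]} + \frac{(H^{-1}(1-\gamma) - E[\eta \mid \delta])^2}{2\operatorname{var}[\eta \mid \delta]} \\
 &= \log\frac{h(H^{-1}(1-\gamma))}{h(H^{-1}(\gamma))} + \frac{(H^{-1}(1-\gamma))^2 - (H^{-1}(\gamma))^2}{2\operatorname{var}[\eta \mid \delta]} + \frac{H^{-1}(\gamma) - H^{-1}(1-\gamma)}{\operatorname{var}[\eta \mid \delta]} \, E[\eta \mid \delta].
\end{aligned} \tag{9} \]

$E[\eta \mid \delta] = \delta$ and $\operatorname{var}[\eta \mid \delta]$ does not depend on $\delta$, and so $\lambda$ is linear in $\delta$ as required.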


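Proof of Lemma 5. Under linear strategies, individual $i$ votes for candidate 1 whenever
$\eta + a + b\delta_i + \varepsilon_i > 0$, or equivalently $a + (1+b)\eta > -\varepsilon_i - b(\delta_i - \eta)$. Conditional on $\eta$, this last
term is normally distributed with zero expectation, and variance $\bar{\kappa}^2 = \operatorname{var}[\varepsilon_i + b(\delta_i - \eta)]$.
Thus the probability $p$ of a vote for candidate 1 satisfies

\[ p = H(\eta) = \Phi\!\left( \frac{a + (1+b)\eta}{\bar{\kappa}} \right) \;\Rightarrow\; H^{-1}(p) = \frac{\bar{\kappa}\Phi^{-1}(p) - a}{1+b}, \]

where $\Phi(\cdot)$ is the cumulative distribution function of the standard normal. Differentiating,

\[ h(\eta) = H'(\eta) = \frac{1+b}{\bar{\kappa}} \, \phi\!\left( \frac{a + (1+b)\eta}{\bar{\kappa}} \right) \;\Rightarrow\; h(H^{-1}(p)) = \frac{1+b}{\bar{\kappa}} \, \phi(\Phi^{-1}(p)). \]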


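Begin with the first term of Equation (9). First employ the symmetry of the normal distribution
to note that $\Phi^{-1}(1-\gamma) = -\Phi^{-1}(\gamma)$, and that $\phi(z) = \phi(-z)$. It follows that

\[ \log\frac{h(H^{-1}(1-\gamma))}{h(H^{-1}(\gamma))} = \log\frac{\phi(\Phi^{-1}(1-\gamma))}{\phi(\Phi^{-1}(\gamma))} = 0. \]

Next consider the second term of Equation (9),

\[ (H^{-1}(\gamma))^2 = \frac{(\bar{\kappa}\Phi^{-1}(\gamma) - a)^2}{(1+b)^2} = \frac{[\bar{\kappa}\Phi^{-1}(\gamma)]^2 + a^2 - 2a\bar{\kappa}\Phi^{-1}(\gamma)}{(1+b)^2}. \]

Similarly,

\[ (H^{-1}(1-\gamma))^2 = \frac{[\bar{\kappa}\Phi^{-1}(1-\gamma)]^2 + a^2 - 2a\bar{\kappa}\Phi^{-1}(1-\gamma)}{(1+b)^2} = \frac{[\bar{\kappa}\Phi^{-1}(\gamma)]^2 + a^2 + 2a\bar{\kappa}\Phi^{-1}(\gamma)}{(1+b)^2}, \]

and so it follows that

\[ \frac{(H^{-1}(1-\gamma))^2 - (H^{-1}(\gamma))^2}{2\operatorname{var}[\eta \mid \delta]} = \frac{2a\bar{\kappa}\Phi^{-1}(\gamma)}{\operatorname{var}[\eta \mid \delta](1+b)^2}. \]

The final term is simply

\[ \frac{H^{-1}(\gamma) - H^{-1}(1-\gamma)}{\operatorname{var}[\eta \mid \delta]} \, E[\eta \mid \delta] = \frac{2\bar{\kappa}\Phi^{-1}(\gamma)(1+b)}{(1+b)^2\operatorname{var}[\eta \mid \delta]} \, E[\eta \mid \delta]. \]

Assemble these terms and, recalling that $E[\eta \mid \delta] = \delta$ and $\operatorname{var}[\eta \mid \delta] = \kappa^2$, obtain:

\[ \log\frac{f(\gamma \mid \delta)}{f(1-\gamma \mid \delta)} = \frac{2\bar{\kappa}\Phi^{-1}(\gamma)(a + (1+b)\delta)}{\kappa^2(1+b)^2} = \frac{2\bar{\kappa}\Phi^{-1}(\gamma)a}{\kappa^2(1+b)^2} + \frac{2\bar{\kappa}\Phi^{-1}(\gamma)\delta}{\kappa^2(1+b)} = \frac{a\hat{b}(b)}{1+b} + \hat{b}(b)\delta \quad \text{where} \quad \hat{b}(b) = \frac{2\bar{\kappa}\Phi^{-1}(\gamma)}{\kappa^2(1+b)}. \]

This yields the desired intercept parameter. For the slope parameter,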


\[ \bar{\kappa}^2 = \operatorname{var}[\varepsilon_i + b(\delta_i - \eta) \mid \eta] = \xi^2 + b^2\kappa^2 + 2b\operatorname{cov}[\delta_i, \tilde{u}_i \mid \eta] = \xi^2 + (b^2 + 2b)\kappa^2, \]

where the last step follows from Lemma 1. Substituting, this yields

\[ \hat{b}(b) = \frac{2\Phi^{-1}(\gamma)\sqrt{\xi^2 + (b^2 + 2b)\kappa^2}}{\kappa^2(1+b)}. \]

For the final part of the proof, differentiate this with respect to $b$:

\[ \hat{b}'(b) = \frac{2\Phi^{-1}(\gamma)}{\kappa^2}\left[ \frac{(1+b)\kappa^2}{(1+b)\sqrt{\xi^2 + (b^2+2b)\kappa^2}} - \frac{\sqrt{\xi^2 + (b^2+2b)\kappa^2}}{(1+b)^2} \right] = \hat{b}(b)\left[ \frac{(1+b)\kappa^2}{\xi^2 + (b^2+2b)\kappa^2} - \frac{1}{1+b} \right]. \]

For $b > 0$, this derivative is negative if and only if

\[ \frac{(1+b)^2\kappa^2}{\xi^2 + (b^2+2b)\kappa^2} = \frac{(b^2 + 2b + 1)\kappa^2}{\xi^2 + (b^2+2b)\kappa^2} < 1 \;\Leftrightarrow\; \kappa^2 < \xi^2, \]

which holds since $\xi^2$ is an upper bound to the variance of a voter’s signal. Finally, evaluate
$\hat{b}(b)$ at $b = 0$ and as $b \to \infty$ to obtain the final statements of the lemma.


Proof of Lemma 6. By inspection of Equation (2). 


The following lemma yields the proof of Proposition 2. 
Lemma 7. The mapping $\hat{b}(b)$ has a unique fixed point $b^* > 0$. This is bounded as follows

\[ 2\Phi^{-1}(\gamma) < \kappa b^* < \Phi^{-1}(\gamma)\sqrt{2 + 2\sqrt{1 + \frac{\xi^2}{[\Phi^{-1}(\gamma)]^2}}}, \tag{10} \]

where $\kappa b^*$ attains the upper bound as $\kappa^2 \to 0$.


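Proof of Lemma 7. The uniqueness of $b^*$ follows from the fact that $\hat{b}(b)$ is decreasing. To
obtain an upper bound for the fixed point $b^*$, write $\hat{b}(b)$ as

\[ \hat{b}(b) = \frac{2\Phi^{-1}(\gamma)\sqrt{\xi^2 + (b^2 + 2b)\kappa^2}}{\kappa^2 + b\kappa^2}. \]

Make the change of variable $\beta = \kappa b$ to obtain

\[ \hat{\beta}(\beta) = \frac{2\Phi^{-1}(\gamma)\sqrt{\xi^2 + \beta^2 + 2\kappa\beta}}{\kappa + \beta} = 2\Phi^{-1}(\gamma)\sqrt{\frac{\xi^2 + \beta^2 + 2\kappa\beta}{\kappa^2 + \beta^2 + 2\kappa\beta}}. \]

It is clear that the right-hand side is decreasing in $\kappa$. Hence sending $\kappa \downarrow 0$,

\[ \hat{\beta}(\beta) < \frac{2\Phi^{-1}(\gamma)\sqrt{\xi^2 + \beta^2}}{\beta}. \]

To obtain the desired upper bound from this, solve the equation $\beta^4 - (2\Phi^{-1}(\gamma))^2(\xi^2 + \beta^2) = 0$.
This is quadratic in $\beta^2$, and may be solved to obtain the positive root

\[ \beta^2 = \frac{(2\Phi^{-1}(\gamma))^2 + \sqrt{(2\Phi^{-1}(\gamma))^4 + 4(2\Phi^{-1}(\gamma))^2\xi^2}}{2} = \frac{(2\Phi^{-1}(\gamma))^2}{2}\left[ 1 + \sqrt{1 + \frac{\xi^2}{(\Phi^{-1}(\gamma))^2}} \right]. \]

It follows that an upper bound is

\[ \kappa b^* < \Phi^{-1}(\gamma)\sqrt{2 + 2\sqrt{1 + \frac{\xi^2}{[\Phi^{-1}(\gamma)]^2}}}, \]

and moreover, it is clear that this bound is attained as $\kappa \downarrow 0$.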


Proof of Proposition 2. From Lemma 7 and the argument given in the main text. 


The following lemma yields the proof of Proposition 3. 
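Lemma 8. $b^*$ is globally stable in the iterative best response dynamic: $b_t \to b^*$ as $t \to \infty$.

Proof of Lemma 8. Consider the mapping $B(b) = \hat{b}^{(2)}(b) = \hat{b}(\hat{b}(b))$. Notice that $b^* = \hat{b}(b^*)$
is also a fixed point of $B$. Taking the derivative $B'(b) = \hat{b}'(\hat{b}(b))\hat{b}'(b)$, it follows that this is
an increasing function, since $\hat{b}' < 0$. Consider a generic fixed point $b$, satisfying $B(b) = b$.
Evaluate the derivative at this fixed point. Writing $\tilde{b} = \hat{b}(b)$, so that $\hat{b}(\tilde{b}) = b$, this satisfies

\[ \begin{aligned}
B'(b) &= b\left[ \frac{\tilde{b}\kappa^2 + \kappa^2}{\xi^2 + (\tilde{b}^2 + 2\tilde{b})\kappa^2} - \frac{1}{1+\tilde{b}} \right] \times \tilde{b}\left[ \frac{b\kappa^2 + \kappa^2}{\xi^2 + (b^2 + 2b)\kappa^2} - \frac{1}{1+b} \right] \\
&= \left[ \frac{(1+\tilde{b})\kappa^2\tilde{b}}{\xi^2 + (\tilde{b}^2 + 2\tilde{b})\kappa^2} - \frac{\tilde{b}}{1+\tilde{b}} \right] \times \left[ \frac{(1+b)\kappa^2 b}{\xi^2 + (b^2 + 2b)\kappa^2} - \frac{b}{1+b} \right].
\end{aligned} \]

Both terms are less than one in absolute value, and hence $B'(b) < 1$ at a fixed point. It follows that any fixed
point must be a downcrossing. Further fixed points would require an upcrossing, and hence
there is a unique fixed-point $b^*$. From this it follows that $b_t \to b^*$. To see this, notice that
$b_{t+2} = B(b_t)$. From the properties of $B$, there is the required convergence.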
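The iterative best-response dynamic of Lemma 8 is easy to simulate. The sketch below is illustrative only: it assumes the best-response slope takes the form $\hat{b}(b) = 2\Phi^{-1}(\gamma)\sqrt{\xi^2 + (b^2+2b)\kappa^2}\,/\,[\kappa^2(1+b)]$ obtained in the proof of Lemma 5, and the parameter values in the usage example are hypothetical rather than calibrated. Function names are chosen for this example.

```python
from math import sqrt
from statistics import NormalDist


def best_response_slope(b: float, gamma: float, xi: float, kappa: float) -> float:
    """b_hat(b): the slope of the best response when others use slope b."""
    z = NormalDist().inv_cdf(gamma)  # Phi^{-1}(gamma); positive when gamma > 1/2
    return 2.0 * z * sqrt(xi**2 + (b**2 + 2.0 * b) * kappa**2) / (kappa**2 * (1.0 + b))


def equilibrium_slope(gamma: float, xi: float, kappa: float,
                      b0: float = 0.0, tol: float = 1e-12, max_iter: int = 10_000) -> float:
    """Iterate b_{t+1} = b_hat(b_t) from b0 until successive values agree."""
    if not (0.5 < gamma < 1.0 and 0.0 < kappa < xi):
        raise ValueError("need 1/2 < gamma < 1 and 0 < kappa < xi")
    b = b0
    for _ in range(max_iter):
        b_next = best_response_slope(b, gamma, xi, kappa)
        if abs(b_next - b) < tol:
            return b_next
        b = b_next
    raise RuntimeError("best-response iteration did not converge")


if __name__ == "__main__":
    # Hypothetical inputs: gamma = 0.6, xi = 1 and m = xi^2 / kappa^2 = 5.
    gamma, xi, kappa = 0.6, 1.0, sqrt(0.2)
    b_star = equilibrium_slope(gamma, xi, kappa)
    print(f"b* = {b_star:.4f}, kappa * b* = {kappa * b_star:.4f}")
```

Starting from no strategic voting ($b_0 = 0$), the iterates overshoot and then oscillate towards the fixed point, which is the negative-feedback pattern the lemma formalizes. The limit can also be compared against the two bounds in Equation (10).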


Proof of Proposition 3. Given that $b_t \to b^*$, it follows that $b_{t+1}/(1+b_t) \to b^*/(1+b^*)$. From
this, the global stability of $a^*$ is immediate: $a_{t+1}/a_t \to b^*/(1+b^*) < 1$ and so $a_t \to 0$. As
the iterative best-response process continues, any “bias” toward a particular candidate is
eliminated. Combining this observation with Lemma 8 completes the proof.


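Proof of Proposition 4. The effects of $\gamma$ and $m$ follow immediately from Lemma 6. The
expected strategic incentive is $b\eta = b\xi\Phi^{-1}(\pi)$ which is increasing in $\pi$. For the effect of $\xi^2$,

\[ b^* = \hat{b}(b^*) \;\Rightarrow\; \frac{db^*}{d\xi} = \frac{\partial \hat{b}(b^*)}{\partial \xi} + \hat{b}'(b^*)\frac{db^*}{d\xi} \;\Rightarrow\; \frac{db^*}{d\xi} = \frac{1}{1 - \hat{b}'(b^*)}\frac{\partial \hat{b}(b^*)}{\partial \xi}. \]

Equation (3) yields $\partial \hat{b}(b^*)/\partial \xi = -\hat{b}(b^*)/\xi = -b^*/\xi$. Combine these two expressions to obtain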
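\[ \frac{d[\xi b^*]}{d\xi} = b^* + \xi\frac{db^*}{d\xi} = b^* - \frac{b^*}{1 - \hat{b}'(b^*)} = -\frac{\hat{b}'(b^*)\,b^*}{1 - \hat{b}'(b^*)} > 0. \]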


The last inequality follows from $\hat{b}'(b^*) < 0$ (since $\hat{b}(b)$ is decreasing) and $\hat{b}'(b^*) > -1$ (since
$b^*$ is a stable fixed point, from Lemma 8). It remains to consider the behavior of strategic
incentives as $m \to \infty$. The first equality is a restatement of Equation (4). The limiting
behavior of the equilibrium strategic incentive as $m \to \infty$ is equivalent to that when $\kappa^2 \to 0$.
The proof to Lemma 7 demonstrates that $\kappa b^*$ attains the upper bound of Equation (10) as
$\kappa \to 0$. This completes the proof.


Proof of Proposition 5. From the discussion in the text and properties of b*. 


Proof of Proposition 6. By inspection ¥ is increasing in both w and d and 7 is increasing in 


d. Differentiation of 7 with respect to w demonstrates that 7 is also increasing in w. 
APPENDIX B. OMITTED MATERIAL FROM SECTIONS 5-6.


B.1. Refining the Impact Measures. When calculating the impact of strategic voting
(via Equation (6)) I “equipped” a voter with an accurate signal realization $\delta_i = \eta$, and then
calculated $p = \Phi((1+b)\Phi^{-1}(\pi))$. A slightly different result is obtained when I consider her
behavior conditional on her actually receiving a signal realization $\delta_i = \eta$. This is because
her preferences form part of the signal, and hence, conditional on $\delta_i = \eta$, it is no longer the
case that $\tilde{u}_i \sim N(\eta, \xi^2)$. In fact, Bayesian updating following observation of $\delta_i = \eta$ yields

\[ \tilde{u}_i \mid \delta_i \sim N\!\left( \eta, \frac{\xi^2(m-1)}{m} \right) \;\Rightarrow\; p = \Phi\!\left( \sqrt{\frac{m}{m-1}}\,(1+b)\Phi^{-1}(\pi) \right), \]

which is a slight modification to Equation (6).


B.2. Specification of the Idiosyncrasy Parameter $\xi^2$. $\xi^2$ may be specified in a number
of different ways. One way is to set two different quantiles for voters with different intensity
of preference. For instance, a voter prefers candidate 1 when $u_1 > u_2$, and hence I may
define $\pi(1) = \Pr[u_1 > u_2] = \Phi(\eta/\xi)$. A voter prefers candidate 1 $k$ times as much as candidate 2
(relative to the disliked status-quo) when $u_1 > ku_2 \Leftrightarrow \tilde{u} > \log k$, yielding $\pi(k) = \Pr[u_1 >
ku_2] = \Phi((\eta - \log k)/\xi)$. These two equations are sufficient to tie down $\eta$ and $\xi^2$:

\[ \pi(k) = \Phi\!\left( \frac{\eta - \log k}{\xi} \right) \;\Rightarrow\; \xi\Phi^{-1}(\pi(k)) = \eta - \log k = \xi\Phi^{-1}(\pi(1)) - \log k \;\Rightarrow\; \xi = \frac{\log k}{\Phi^{-1}(\pi(1)) - \Phi^{-1}(\pi(k))}. \]

To see this formula in action, consider an electorate where a fraction $\pi = \pi(1) = 0.6$ of
the electorate rank candidate 1 highest, and half of these (or a fraction $\pi(2) = 0.3$ of the
electorate) prefer candidate 1 twice as much as candidate 2. Then,

\[ \xi = \frac{\log 2}{\Phi^{-1}(0.6) - \Phi^{-1}(0.3)} = 0.891 \;\Rightarrow\; \xi^2 = 0.794. \]

Using this formula, $\xi^2$ may also be specified by considering the median supporter of candidate
1. If the median supporter prefers candidate 1 $k$ times as much as candidate 2, then (by definition
of being the median in this group) $\pi(k) = \pi(1)/2$. Hence $\xi = \log k / [\Phi^{-1}(\pi) - \Phi^{-1}(\pi/2)]$.
When $\pi = 1/2$ this formula reduces to $\xi = -\log k / \Phi^{-1}(1/4)$. Illustrating this last formulation,
consider a balanced electorate in which $\pi = 1/2$. Suppose that the median supporter
of candidate 1 prefers her favored candidate twice as much as candidate 2. Then $\xi = 1.028$
and $\xi^2 = 1.056$. For $k = 1.5$ and $k = 2.5$ the outcomes are $\xi = 0.6$ and $\xi = 1.36$ respectively.
Hence a range of $0.5 < \xi < 1.5$ might seem appropriate for the idiosyncrasy parameter.
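The two specification rules for the idiosyncrasy parameter can be reproduced in a few lines. The sketch below uses natural logarithms and the standard normal quantile function from the Python standard library; the function names are hypothetical, and the inputs are the worked values quoted in the text.

```python
from math import log
from statistics import NormalDist


def idiosyncrasy_from_quantiles(pi_1: float, pi_k: float, k: float) -> float:
    """xi = log k / (Phi^{-1}(pi(1)) - Phi^{-1}(pi(k))), for k > 1 and pi_1 > pi_k."""
    std = NormalDist()
    return log(k) / (std.inv_cdf(pi_1) - std.inv_cdf(pi_k))


def idiosyncrasy_from_median_supporter(pi: float, k: float) -> float:
    """Same formula when the median supporter prefers candidate 1 k times as
    much as candidate 2, so that pi(k) = pi / 2."""
    return idiosyncrasy_from_quantiles(pi, pi / 2.0, k)


if __name__ == "__main__":
    xi = idiosyncrasy_from_quantiles(0.6, 0.3, 2.0)
    print(f"two-quantile rule:  xi = {xi:.3f}, xi^2 = {xi**2:.3f}")
    for k in (1.5, 2.0, 2.5):
        xi_k = idiosyncrasy_from_median_supporter(0.5, k)
        print(f"median rule, k = {k}: xi = {xi_k:.3f}")
```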


B.3. Calibration of the UK General Election of 1997. I considered only the 529
English constituencies. The Speaker’s seat and Tatton (where the major parties pulled out
as part of an anti-sleaze protest) are excluded, yielding 527 remaining constituencies. The
three leading candidates were then the three major political parties. Writing $\psi_0$ for the
Conservative vote share among these three, I excluded $\psi_0 > 1/2$ (a certain Tory win) and
$\psi_0 < 1/3$ (a certain Tory loss). Within the remaining 270 constituencies I labelled the
leading challenger as candidate 1 and the trailing challenger as candidate 2. This led to
$\gamma = \psi_0/(1 - \psi_0)$ and $p = \psi_1/(\psi_1 + \psi_2)$. Using $m = 1/\omega$, I calculated the response parameter
$b$. I then employed Equation (7).
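The first steps of this calibration can be sketched for a single constituency. The classification thresholds and the formulas for the required majority and the observed challenger share follow the description above. The inversion step is an assumption: Equation (7) is not reproduced in this appendix, so the sketch supposes that it inverts the relation $p = \Phi((1+b)\Phi^{-1}(\pi))$ quoted in Section B.1. The vote counts and the response parameter in the usage example are hypothetical.

```python
from statistics import NormalDist


def calibrate_constituency(con: float, lab: float, lib: float):
    """Classify a seat and, where coordination is required, return the
    required majority gamma and the leading challenger's observed share p."""
    psi0 = con / (con + lab + lib)  # Conservative share among the three parties
    if psi0 > 0.5:
        return ("certain Tory win", None, None)
    if psi0 < 1.0 / 3.0:
        return ("certain Tory loss", None, None)
    lead, trail = max(lab, lib), min(lab, lib)
    gamma = psi0 / (1.0 - psi0)     # share of challenger votes needed to beat the incumbent
    p = lead / (lead + trail)       # leading challenger's share of challenger votes
    return ("coordination required", gamma, p)


def notional_support(p: float, b: float) -> float:
    """ASSUMED inversion: solve p = Phi((1 + b) Phi^{-1}(pi)) for pi."""
    std = NormalDist()
    return std.cdf(std.inv_cdf(p) / (1.0 + b))


if __name__ == "__main__":
    label, gamma, p = calibrate_constituency(40.0, 35.0, 25.0)  # hypothetical votes
    print(label, round(gamma, 3), round(p, 3))
    print(round(notional_support(p, b=1.5), 3))                 # hypothetical b
```

Because strategic voting inflates the leading challenger's observed share, the notional support recovered this way lies between one half and the observed share whenever the response parameter is positive.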


APPENDIX C. NOT-FOR-PUBLICATION 


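Proof of Proposition 1. I introduce the parameter $\tilde{\gamma}$ where $\tfrac{1}{2} < \tilde{\gamma} < 1$ and define

\[ r(\tilde{\gamma}) = \frac{\int_0^1 [p^{\tilde{\gamma}}(1-p)^{1-\tilde{\gamma}}]^n f(p)\,dp}{\int_0^1 [p^{\tilde{\gamma}}(1-p)^{1-\tilde{\gamma}}]^n\,dp}, \]

where $f(p)$ is a continuous density taking strictly positive values on $(0,1)$, following Lemma 3.
Dependence on $\delta$ is suppressed for simplicity. I next introduce the notation $G(p)$,

\[ G(p) = \frac{p^{\tilde{\gamma}}(1-p)^{1-\tilde{\gamma}}}{\tilde{\gamma}^{\tilde{\gamma}}(1-\tilde{\gamma})^{1-\tilde{\gamma}}} \;\Rightarrow\; r(\tilde{\gamma}) = \frac{\int_0^1 G(p)^n f(p)\,dp}{\int_0^1 G(p)^n\,dp}. \]

$G(p)$ is increasing from $G(0) = 0$, attaining a maximum of $G(\tilde{\gamma}) = 1$ at $p = \tilde{\gamma}$, and then declining
back to $G(1) = 0$. Next I fix $\min\{1/4, 1-\tilde{\gamma}\} > \epsilon > 0$. For convenience, I now define
$\bar{f}_\epsilon(\tilde{\gamma}) = \max_{\tilde{\gamma}-\epsilon \leq p \leq \tilde{\gamma}+\epsilon} f(p) < \infty$, where this is well defined since $[\tilde{\gamma}-\epsilon, \tilde{\gamma}+\epsilon]$ is a compact
set and $f(p)$ is continuous from Lemma 3. I formulate an upper bound for the ratio $r(\tilde{\gamma})$,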


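\[ \begin{aligned}
r(\tilde{\gamma}) \leq{}& \frac{\bar{f}_\epsilon(\tilde{\gamma})\int_{\tilde{\gamma}-\epsilon}^{\tilde{\gamma}+\epsilon} G(p)^n\,dp + \bar{f}_{2\epsilon}(\tilde{\gamma})\left[ \int_{\tilde{\gamma}-2\epsilon}^{\tilde{\gamma}-\epsilon} G(p)^n\,dp + \int_{\tilde{\gamma}+\epsilon}^{\tilde{\gamma}+2\epsilon} G(p)^n\,dp \right]}{\int_{\tilde{\gamma}-\epsilon}^{\tilde{\gamma}+\epsilon} G(p)^n\,dp} \\
&+ \frac{F(\tilde{\gamma}-2\epsilon)\,G(\tilde{\gamma}-2\epsilon)^n + (1 - F(\tilde{\gamma}+2\epsilon))\,G(\tilde{\gamma}+2\epsilon)^n}{\int_{\tilde{\gamma}-\epsilon}^{\tilde{\gamma}+\epsilon} G(p)^n\,dp}.
\end{aligned} \]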

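The right hand side of this expression has five terms. I consider them in turn. First,

\[ \frac{\bar{f}_\epsilon(\tilde{\gamma})\int_{\tilde{\gamma}-\epsilon}^{\tilde{\gamma}+\epsilon} G(p)^n\,dp}{\int_{\tilde{\gamma}-\epsilon}^{\tilde{\gamma}+\epsilon} G(p)^n\,dp} = \bar{f}_\epsilon(\tilde{\gamma}). \]

Next, the denominator of the second term. $G(p)$ is increasing from $\tilde{\gamma}-\epsilon$ to $\tilde{\gamma}$ and hence

\[ \int_{\tilde{\gamma}-\epsilon}^{\tilde{\gamma}+\epsilon} G(p)^n\,dp \geq \int_{\tilde{\gamma}-\epsilon}^{\tilde{\gamma}} G(p)^n\,dp \geq \epsilon\,G(\tilde{\gamma}-\epsilon)^n, \]

so that taking the second term, and allowing $n \to \infty$, it follows that

\[ \frac{\bar{f}_{2\epsilon}(\tilde{\gamma})\int_{\tilde{\gamma}-2\epsilon}^{\tilde{\gamma}-\epsilon} G(p)^n\,dp}{\int_{\tilde{\gamma}-\epsilon}^{\tilde{\gamma}+\epsilon} G(p)^n\,dp} \leq \frac{\bar{f}_{2\epsilon}(\tilde{\gamma})}{\epsilon}\int_{\tilde{\gamma}-2\epsilon}^{\tilde{\gamma}-\epsilon} \left[ \frac{G(p)}{G(\tilde{\gamma}-\epsilon)} \right]^n dp \to 0 \quad \text{as} \quad n \to \infty, \]

which holds since $G(p) < G(\tilde{\gamma}-\epsilon)$ for all $p < \tilde{\gamma}-\epsilon$. An identical argument ensures that
the third term vanishes. A similar argument applies to the fourth term

\[ \frac{F(\tilde{\gamma}-2\epsilon)\,G(\tilde{\gamma}-2\epsilon)^n}{\int_{\tilde{\gamma}-\epsilon}^{\tilde{\gamma}+\epsilon} G(p)^n\,dp} \leq \frac{F(\tilde{\gamma}-2\epsilon)}{\epsilon}\left[ \frac{G(\tilde{\gamma}-2\epsilon)}{G(\tilde{\gamma}-\epsilon)} \right]^n \to 0 \quad \text{as} \quad n \to \infty, \]

with a symmetric argument for the fifth term. I conclude that $\lim_{n\to\infty} r(\tilde{\gamma}) \leq \bar{f}_\epsilon(\tilde{\gamma})$. Notice
now that $\epsilon$ may be chosen arbitrarily small. It follows that

\[ \lim_{n\to\infty} r(\tilde{\gamma}) \leq \lim_{\epsilon\to 0} \bar{f}_\epsilon(\tilde{\gamma}) = f(\tilde{\gamma}). \]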

A similar procedure bounds the limit below, and hence $r(\tilde{\gamma}) \to f(\tilde{\gamma})$. Next, I construct a
compact interval $[\gamma - \iota, \gamma + \iota]$ around $\gamma$, for small $\iota$. For $\tilde{\gamma} \in [\gamma - \iota, \gamma + \iota]$, the argument
above establishes that $r(\tilde{\gamma}) \to f(\tilde{\gamma})$ pointwise on this interval. But since $r(\tilde{\gamma})$ and its limit
are continuous, and the interval is compact, it follows that this convergence is uniform. Now,
recall that $\gamma_n = \lceil \gamma n \rceil / n$, and hence $\gamma_n \to \gamma$. It follows that $\gamma_n \in [\gamma - \iota, \gamma + \iota]$ for sufficiently
large $n$. For sufficiently large $n$, $r(\gamma_n)$ is arbitrarily close to $f(\gamma_n)$. By taking $\iota$ sufficiently
small, it is assured that this is arbitrarily close to $f(\gamma)$, which follows from the continuity of
$f$. Finally, to complete the proof, I note that


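\[ \int_0^1 [p^{\gamma_n}(1-p)^{1-\gamma_n}]^n\,dp = \frac{\Gamma(\gamma_n n + 1)\,\Gamma((1-\gamma_n)n + 1)}{\Gamma(n+2)}. \]

This follows from spotting the density of the Beta distribution with parameters $\gamma_n n$ and
$n - \gamma_n n$. It then follows that

\[ (n+1)q_1 = \frac{\int_0^1 [p^{\gamma_n}(1-p)^{1-\gamma_n}]^n f(p)\,dp}{\int_0^1 [p^{\gamma_n}(1-p)^{1-\gamma_n}]^n\,dp} = r(\gamma_n) \to f(\gamma). \]

From this the result for $(n+1)q_1$ follows, with a similar approach for $(n+1)q_2$.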
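The limit in Proposition 1 can be checked numerically in a special case. The sketch below assumes that the pivotal probability takes the binomial-mixture form suggested by the Beta-density step of the proof, and it takes the belief density over $p$ to be a Beta density so that the integral has a closed form. Both choices, along with the function names, are assumptions made for illustration; they are not part of the proof.

```python
from math import ceil, exp, lgamma, log


def log_beta(a: float, b: float) -> float:
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def scaled_pivotal_probability(n: int, gamma: float, alpha: float, beta: float) -> float:
    """(n + 1) * C(n, k) * E[p^k (1 - p)^(n - k)] for p ~ Beta(alpha, beta),
    where k = ceil(gamma * n). Computed in logs to avoid underflow."""
    k = ceil(gamma * n)
    log_binom = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    log_q = log_binom + log_beta(k + alpha, n - k + beta) - log_beta(alpha, beta)
    return (n + 1) * exp(log_q)


def beta_density(p: float, alpha: float, beta: float) -> float:
    """Density of the Beta(alpha, beta) distribution at p in (0, 1)."""
    return exp((alpha - 1.0) * log(p) + (beta - 1.0) * log(1.0 - p) - log_beta(alpha, beta))


if __name__ == "__main__":
    gamma, alpha, beta = 0.6, 2.0, 2.0
    print(f"f(gamma) = {beta_density(gamma, alpha, beta):.5f}")
    for n in (10, 100, 1_000, 100_000):
        print(n, round(scaled_pivotal_probability(n, gamma, alpha, beta), 5))
```

As the electorate grows, the scaled pivotal probability approaches the belief density evaluated at the required majority, which is the content of the proposition in this special case.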